archived-ballistic/spec/arm64_xml/usmops_za_pp_zz.xml
Ronald Caesar 26a677f8b4 decoder: Add ARM specification docs
Signed-off-by: Ronald Caesar <github43132@proton.me>
2025-12-12 18:11:36 -04:00

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" encoding="UTF-8" href="iform.xsl" version="1.0"?>
<!DOCTYPE instructionsection PUBLIC "-//ARM//DTD instructionsection //EN" "iform-p.dtd">
<!-- Copyright (c) 2010-2022 Arm Limited or its affiliates. All rights reserved. -->
<!-- This document is Non-Confidential. This document may only be used and distributed in accordance with the terms of the agreement entered into by Arm and the party that Arm delivered this document to. -->
<instructionsection id="usmops_za_pp_zz" title="USMOPS" type="instruction">
<docvars>
<docvar key="instr-class" value="mortlach" />
<docvar key="isa" value="A64" />
<docvar key="mnemonic" value="USMOPS" />
</docvars>
<heading>USMOPS</heading>
<desc>
<brief>Unsigned by signed integer sum of outer products and subtract</brief>
<description>
<para>The 8-bit integer variant works with a 32-bit element ZA tile.</para>
<para>The 16-bit integer variant works with a 64-bit element ZA tile.</para>
<para>The unsigned by signed integer sum of outer products and subtract instructions multiply the sub-matrix in the first source vector by the sub-matrix in the second source vector. In the 8-bit integer variant, the first source holds an SVL<sub>S</sub>×4 sub-matrix of unsigned 8-bit integer values, and the second source holds a 4×SVL<sub>S</sub> sub-matrix of signed 8-bit integer values. In the 16-bit integer variant, the first source holds an SVL<sub>D</sub>×4 sub-matrix of unsigned 16-bit integer values, and the second source holds a 4×SVL<sub>D</sub> sub-matrix of signed 16-bit integer values.</para>
<para>Each source vector is independently predicated by a corresponding governing predicate. When a source element (8-bit in the 8-bit integer variant, 16-bit in the 16-bit integer variant) is Inactive, it is treated as having the value 0.</para>
<para>The resulting SVL<sub>S</sub>×SVL<sub>S</sub> widened 32-bit integer or SVL<sub>D</sub>×SVL<sub>D</sub> widened 64-bit integer sum of outer products is then destructively subtracted from the 32-bit integer or 64-bit integer destination tile, for the 8-bit integer and 16-bit integer instruction variants respectively. This is equivalent to computing a 4-way dot product for each destination tile element and subtracting it from that element.</para>
<para>In the 8-bit integer variant, each 32-bit container of the first source vector holds 4 consecutive column elements of each row of an SVL<sub>S</sub>×4 sub-matrix, and each 32-bit container of the second source vector holds 4 consecutive row elements of each column of a 4×SVL<sub>S</sub> sub-matrix. In the 16-bit integer variant, each 64-bit container of the first source vector holds 4 consecutive column elements of each row of an SVL<sub>D</sub>×4 sub-matrix, and each 64-bit container of the second source vector holds 4 consecutive row elements of each column of a 4×SVL<sub>D</sub> sub-matrix.</para>
<para>ID_AA64SMFR0_EL1.I16I64 indicates whether the 16-bit integer variant is implemented.</para>
</description>
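<!--
Editorial sketch, not part of the Arm specification: the container layout
described in the paragraphs above can be illustrated in Python. The helper name
pack_containers is hypothetical. Placing element k = 0 in the least significant
bits of each container is an assumption taken from the
Elem[operand, 4*row + k, esize DIV 4] indexing in the Operation pseudocode of
this file.

```python
def pack_containers(groups, src_bits):
    """Pack groups of 4 source elements into containers of 4 * src_bits bits.

    groups:   list of 4-element lists of raw element bit patterns.
    src_bits: 8 for the 8-bit integer variant, 16 for the 16-bit variant.
    """
    elem_mask = (1 << src_bits) - 1
    containers = []
    for group in groups:
        if len(group) != 4:
            raise ValueError("each container holds exactly 4 elements")
        value = 0
        for k, elem in enumerate(group):
            # Element k starts at bit k * src_bits of its container.
            value |= (elem & elem_mask) << (k * src_bits)
        containers.append(value)
    return containers
```

For the first source, pass the rows of the SVL×4 sub-matrix. For the second
source, pass the columns of the 4×SVL sub-matrix, so that each container holds
the 4 row elements of one column.
-->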
<status>Green</status>
<predicated>True</predicated>
<uses_dit condition="FEAT_SVE2 is implemented or FEAT_SME is implemented">True</uses_dit>
<sm_policy>SM_1_only</sm_policy>
<is_gov_pred_pair>True</is_gov_pred_pair>
</desc>
<alias_list howmany="0"></alias_list>
<classes>
<classesintro count="2">
<txt>It has encodings from 2 classes:</txt>
<a href="#iclass_per_word">32-bit</a>
<txt> and </txt>
<a href="#iclass_per_doubleword">64-bit</a>
</classesintro>
<iclass name="32-bit" oneof="2" id="iclass_per_word" no_encodings="1" isa="A64">
<docvars>
<docvar key="asimdimm-datatype" value="per-word" />
<docvar key="instr-class" value="mortlach" />
<docvar key="isa" value="A64" />
<docvar key="mnemonic" value="USMOPS" />
</docvars>
<iclassintro count="1"></iclassintro>
<arch_variants>
<arch_variant name="FEAT_SME" feature="FEAT_SME" />
</arch_variants>
<regdiagram form="32" psname="USMOPS-ZA.PP.ZZ-32" tworows="1">
<box hibit="31" width="2" settings="2">
<c>1</c>
<c>0</c>
</box>
<box hibit="29" width="5" settings="5">
<c>1</c>
<c>0</c>
<c>0</c>
<c>0</c>
<c>0</c>
</box>
<box hibit="24" name="u0" usename="1" settings="1">
<c>1</c>
</box>
<box hibit="23" width="2" settings="2">
<c>1</c>
<c>0</c>
</box>
<box hibit="21" name="u1" usename="1" settings="1">
<c>0</c>
</box>
<box hibit="20" width="5" name="Zm" usename="1">
<c colspan="5"></c>
</box>
<box hibit="15" width="3" name="Pm" usename="1">
<c colspan="3"></c>
</box>
<box hibit="12" width="3" name="Pn" usename="1">
<c colspan="3"></c>
</box>
<box hibit="9" width="5" name="Zn" usename="1">
<c colspan="5"></c>
</box>
<box hibit="4" name="S" usename="1" settings="1">
<c>1</c>
</box>
<box hibit="3" settings="1">
<c>0</c>
</box>
<box hibit="2" settings="1">
<c>0</c>
</box>
<box hibit="1" width="2" name="ZAda" usename="1">
<c colspan="2"></c>
</box>
</regdiagram>
<encoding name="usmops_za_pp_zz_32" oneofinclass="1" oneof="2" label="">
<docvars>
<docvar key="asimdimm-datatype" value="per-word" />
<docvar key="instr-class" value="mortlach" />
<docvar key="isa" value="A64" />
<docvar key="mnemonic" value="USMOPS" />
</docvars>
<asmtemplate><text>USMOPS </text><a link="sa_zada" hover="ZA tile ZA0-ZA3 (field &quot;ZAda&quot;)">&lt;ZAda&gt;</a><text>.S, </text><a link="sa_pn" hover="First governing scalable predicate register P0-P7 (field &quot;Pn&quot;)">&lt;Pn&gt;</a><text>/M, </text><a link="sa_pm" hover="Second governing scalable predicate register P0-P7 (field &quot;Pm&quot;)">&lt;Pm&gt;</a><text>/M, </text><a link="sa_zn" hover="First source scalable vector register (field &quot;Zn&quot;)">&lt;Zn&gt;</a><text>.B, </text><a link="sa_zm" hover="Second source scalable vector register (field &quot;Zm&quot;)">&lt;Zm&gt;</a><text>.B</text></asmtemplate>
</encoding>
<ps_section howmany="1">
<ps name="USMOPS-ZA.PP.ZZ-32" mylink="USMOPS-ZA.PP.ZZ-32" enclabels="" sections="1" secttype="noheading">
<pstext mayhavelinks="1" section="Decode" rep_section="decode">if !<a link="impl-aarch64.HaveSME.0" file="shared_pseudocode.xml" hover="function: boolean HaveSME()">HaveSME</a>() then UNDEFINED;
constant integer esize = 32;
integer a = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Pn);
integer b = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Pm);
integer n = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Zn);
integer m = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Zm);
integer da = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(ZAda);
boolean sub_op = TRUE;
boolean op1_unsigned = TRUE;
boolean op2_unsigned = FALSE;</pstext>
</ps>
</ps_section>
</iclass>
<iclass name="64-bit" oneof="2" id="iclass_per_doubleword" no_encodings="1" isa="A64">
<docvars>
<docvar key="asimdimm-datatype" value="per-doubleword" />
<docvar key="instr-class" value="mortlach" />
<docvar key="isa" value="A64" />
<docvar key="mnemonic" value="USMOPS" />
</docvars>
<iclassintro count="1"></iclassintro>
<arch_variants>
<arch_variant name="FEAT_SME_I16I64" feature="FEAT_SME_I16I64" />
</arch_variants>
<regdiagram form="32" psname="USMOPS-ZA.PP.ZZ-64" tworows="1">
<box hibit="31" width="2" settings="2">
<c>1</c>
<c>0</c>
</box>
<box hibit="29" width="5" settings="5">
<c>1</c>
<c>0</c>
<c>0</c>
<c>0</c>
<c>0</c>
</box>
<box hibit="24" name="u0" usename="1" settings="1">
<c>1</c>
</box>
<box hibit="23" width="2" settings="2">
<c>1</c>
<c>1</c>
</box>
<box hibit="21" name="u1" usename="1" settings="1">
<c>0</c>
</box>
<box hibit="20" width="5" name="Zm" usename="1">
<c colspan="5"></c>
</box>
<box hibit="15" width="3" name="Pm" usename="1">
<c colspan="3"></c>
</box>
<box hibit="12" width="3" name="Pn" usename="1">
<c colspan="3"></c>
</box>
<box hibit="9" width="5" name="Zn" usename="1">
<c colspan="5"></c>
</box>
<box hibit="4" name="S" usename="1" settings="1">
<c>1</c>
</box>
<box hibit="3" settings="1">
<c>0</c>
</box>
<box hibit="2" width="3" name="ZAda" usename="1">
<c colspan="3"></c>
</box>
</regdiagram>
<encoding name="usmops_za_pp_zz_64" oneofinclass="1" oneof="2" label="">
<docvars>
<docvar key="asimdimm-datatype" value="per-doubleword" />
<docvar key="instr-class" value="mortlach" />
<docvar key="isa" value="A64" />
<docvar key="mnemonic" value="USMOPS" />
</docvars>
<asmtemplate><text>USMOPS </text><a link="sa_zada_1" hover="ZA tile ZA0-ZA7 (field &quot;ZAda&quot;)">&lt;ZAda&gt;</a><text>.D, </text><a link="sa_pn" hover="First governing scalable predicate register P0-P7 (field &quot;Pn&quot;)">&lt;Pn&gt;</a><text>/M, </text><a link="sa_pm" hover="Second governing scalable predicate register P0-P7 (field &quot;Pm&quot;)">&lt;Pm&gt;</a><text>/M, </text><a link="sa_zn" hover="First source scalable vector register (field &quot;Zn&quot;)">&lt;Zn&gt;</a><text>.H, </text><a link="sa_zm" hover="Second source scalable vector register (field &quot;Zm&quot;)">&lt;Zm&gt;</a><text>.H</text></asmtemplate>
</encoding>
<ps_section howmany="1">
<ps name="USMOPS-ZA.PP.ZZ-64" mylink="USMOPS-ZA.PP.ZZ-64" enclabels="" sections="1" secttype="noheading">
<pstext mayhavelinks="1" section="Decode" rep_section="decode">if !<a link="impl-aarch64.HaveSMEI16I64.0" file="shared_pseudocode.xml" hover="function: boolean HaveSMEI16I64()">HaveSMEI16I64</a>() then UNDEFINED;
constant integer esize = 64;
integer a = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Pn);
integer b = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Pm);
integer n = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Zn);
integer m = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(Zm);
integer da = <a link="impl-shared.UInt.1" file="shared_pseudocode.xml" hover="function: integer UInt(bits(N) x)">UInt</a>(ZAda);
boolean sub_op = TRUE;
boolean op1_unsigned = TRUE;
boolean op2_unsigned = FALSE;</pstext>
</ps>
</ps_section>
</iclass>
</classes>
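<!--
Editorial sketch, not part of the Arm specification: the two register diagrams
above can be read as a field decoder. The Python below is a minimal
illustration. The function names field and decode_usmops are hypothetical, the
returned keys mirror the variable names of the Decode pseudocode, and the
HaveSME() and HaveSMEI16I64() feature checks are deliberately not modelled.

```python
def field(word, hibit, width):
    """Return the `width`-bit field of `word` whose top bit is `hibit`."""
    return (word >> (hibit - width + 1)) & ((1 << width) - 1)


def decode_usmops(word):
    """Return the USMOPS operand fields, or None if `word` is not USMOPS."""
    fixed = (
        field(word, 31, 2) == 0b10
        and field(word, 29, 5) == 0b10000
        and field(word, 24, 1) == 1   # u0
        and field(word, 21, 1) == 0   # u1
        and field(word, 4, 1) == 1    # S
        and field(word, 3, 1) == 0
    )
    if not fixed:
        return None
    size = field(word, 23, 2)
    if size == 0b10 and field(word, 2, 1) == 0:
        esize, da = 32, field(word, 1, 2)   # 32-bit class: ZA0.S to ZA3.S
    elif size == 0b11:
        esize, da = 64, field(word, 2, 3)   # 64-bit class: ZA0.D to ZA7.D
    else:
        return None
    return {
        "esize": esize,
        "da": da,
        "a": field(word, 12, 3),   # Pn
        "b": field(word, 15, 3),   # Pm
        "n": field(word, 9, 5),    # Zn
        "m": field(word, 20, 5),   # Zm
    }
```

The only differences between the two classes are bits 23:22 and the width of
the ZAda field, which is why one function can cover both.
-->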
<explanations scope="all">
<explanation enclist="usmops_za_pp_zz_32" symboldefcount="1">
<symbol link="sa_zada">&lt;ZAda&gt;</symbol>
<account encodedin="ZAda">
<docvars>
<docvar key="asimdimm-datatype" value="per-word" />
</docvars>
<intro>
<para>For the 32-bit variant: is the name of the ZA tile ZA0-ZA3, encoded in the "ZAda" field.</para>
</intro>
</account>
</explanation>
<explanation enclist="usmops_za_pp_zz_64" symboldefcount="2">
<symbol link="sa_zada_1">&lt;ZAda&gt;</symbol>
<account encodedin="ZAda">
<docvars>
<docvar key="asimdimm-datatype" value="per-doubleword" />
</docvars>
<intro>
<para>For the 64-bit variant: is the name of the ZA tile ZA0-ZA7, encoded in the "ZAda" field.</para>
</intro>
</account>
</explanation>
<explanation enclist="usmops_za_pp_zz_32, usmops_za_pp_zz_64" symboldefcount="1">
<symbol link="sa_pn">&lt;Pn&gt;</symbol>
<account encodedin="Pn">
<intro>
<para>Is the name of the first governing scalable predicate register P0-P7, encoded in the "Pn" field.</para>
</intro>
</account>
</explanation>
<explanation enclist="usmops_za_pp_zz_32, usmops_za_pp_zz_64" symboldefcount="1">
<symbol link="sa_pm">&lt;Pm&gt;</symbol>
<account encodedin="Pm">
<intro>
<para>Is the name of the second governing scalable predicate register P0-P7, encoded in the "Pm" field.</para>
</intro>
</account>
</explanation>
<explanation enclist="usmops_za_pp_zz_32, usmops_za_pp_zz_64" symboldefcount="1">
<symbol link="sa_zn">&lt;Zn&gt;</symbol>
<account encodedin="Zn">
<intro>
<para>Is the name of the first source scalable vector register, encoded in the "Zn" field.</para>
</intro>
</account>
</explanation>
<explanation enclist="usmops_za_pp_zz_32, usmops_za_pp_zz_64" symboldefcount="1">
<symbol link="sa_zm">&lt;Zm&gt;</symbol>
<account encodedin="Zm">
<intro>
<para>Is the name of the second source scalable vector register, encoded in the "Zm" field.</para>
</intro>
</account>
</explanation>
</explanations>
<ps_section howmany="1">
<ps name="USMOPS-ZA.PP.ZZ-32" mylink="execute" enclabels="" sections="1" secttype="Operation">
<pstext mayhavelinks="1" section="Execute" rep_section="execute"><a link="impl-aarch64.CheckStreamingSVEAndZAEnabled.0" file="shared_pseudocode.xml" hover="function: CheckStreamingSVEAndZAEnabled()">CheckStreamingSVEAndZAEnabled</a>();
constant integer VL = <a link="impl-aarch64.CurrentVL.read.none" file="shared_pseudocode.xml" hover="accessor: integer CurrentVL">CurrentVL</a>;
constant integer PL = VL DIV 8;
constant integer dim = VL DIV esize;
bits(PL) mask1 = <a link="impl-aarch64.P.read.2" file="shared_pseudocode.xml" hover="accessor: bits(width) P[integer n, integer width]">P</a>[a, PL];
bits(PL) mask2 = <a link="impl-aarch64.P.read.2" file="shared_pseudocode.xml" hover="accessor: bits(width) P[integer n, integer width]">P</a>[b, PL];
bits(VL) operand1 = <a link="impl-aarch64.Z.read.2" file="shared_pseudocode.xml" hover="accessor: bits(width) Z[integer n, integer width]">Z</a>[n, VL];
bits(VL) operand2 = <a link="impl-aarch64.Z.read.2" file="shared_pseudocode.xml" hover="accessor: bits(width) Z[integer n, integer width]">Z</a>[m, VL];
bits(dim*dim*esize) operand3 = <a link="impl-aarch64.ZAtile.read.3" file="shared_pseudocode.xml" hover="accessor: bits(width) ZAtile[integer tile, integer esize, integer width]">ZAtile</a>[da, esize, dim*dim*esize];
bits(dim*dim*esize) result;
integer prod;
for row = 0 to dim-1
    for col = 0 to dim-1
        bits(esize) sum = <a link="impl-shared.Elem.read.3" file="shared_pseudocode.xml" hover="accessor: bits(size) Elem[bits(N) vector, integer e, integer size]">Elem</a>[operand3, row*dim+col, esize];
        for k = 0 to 3
            if <a link="impl-aarch64.ActivePredicateElement.3" file="shared_pseudocode.xml" hover="function: boolean ActivePredicateElement(bits(N) pred, integer e, integer esize)">ActivePredicateElement</a>(mask1, 4*row + k, esize DIV 4) &amp;&amp;
                  <a link="impl-aarch64.ActivePredicateElement.3" file="shared_pseudocode.xml" hover="function: boolean ActivePredicateElement(bits(N) pred, integer e, integer esize)">ActivePredicateElement</a>(mask2, 4*col + k, esize DIV 4) then
                prod = (<a link="impl-shared.Int.2" file="shared_pseudocode.xml" hover="function: integer Int(bits(N) x, boolean unsigned)">Int</a>(<a link="impl-shared.Elem.read.3" file="shared_pseudocode.xml" hover="accessor: bits(size) Elem[bits(N) vector, integer e, integer size]">Elem</a>[operand1, 4*row + k, esize DIV 4], op1_unsigned) *
                        <a link="impl-shared.Int.2" file="shared_pseudocode.xml" hover="function: integer Int(bits(N) x, boolean unsigned)">Int</a>(<a link="impl-shared.Elem.read.3" file="shared_pseudocode.xml" hover="accessor: bits(size) Elem[bits(N) vector, integer e, integer size]">Elem</a>[operand2, 4*col + k, esize DIV 4], op2_unsigned));
                if sub_op then prod = -prod;
                sum = sum + prod;
        <a link="impl-shared.Elem.write.3" file="shared_pseudocode.xml" hover="accessor: Elem[bits(N) &amp;vector, integer e, integer size] = bits(size) value">Elem</a>[result, row*dim+col, esize] = sum;
<a link="impl-aarch64.ZAtile.write.3" file="shared_pseudocode.xml" hover="accessor: ZAtile[integer tile, integer esize, integer width] = bits(width) value">ZAtile</a>[da, esize, dim*dim*esize] = result;</pstext>
</ps>
</ps_section>
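<!--
Editorial sketch, not part of the Arm specification: the Operation pseudocode
above can be mirrored by a small Python reference model. The names usmops and
to_signed are hypothetical. As a simplifying assumption, predicates are passed
as one boolean per source element instead of the architectural predicate
register layout, and vectors are passed as lists of element bit patterns
instead of bit strings.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def usmops(tile, zn, zm, pn, pm, esize=32):
    """Return the updated dim x dim tile after an unsigned by signed MOPS.

    tile:   dim x dim list of esize-bit accumulators (raw bit patterns).
    zn, zm: lists of 4 * dim source elements, each esize // 4 bits wide.
    pn, pm: lists of 4 * dim booleans, True where the element is Active.
    """
    src_bits = esize // 4
    dim = len(tile)
    result = [row[:] for row in tile]
    for row in range(dim):
        for col in range(dim):
            acc = tile[row][col]
            for k in range(4):
                i, j = 4 * row + k, 4 * col + k
                if pn[i] and pm[j]:
                    a = zn[i] & ((1 << src_bits) - 1)   # op1_unsigned = TRUE
                    b = to_signed(zm[j], src_bits)      # op2_unsigned = FALSE
                    acc -= a * b                        # sub_op = TRUE
            # The accumulator is a bits(esize) value, so it wraps modulo 2**esize.
            result[row][col] = acc & ((1 << esize) - 1)
    return result
```

Skipping a product when either predicate element is False has the same effect
as treating an Inactive source element as 0, which is how the description
phrases it.
-->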
</instructionsection>