llvm-capstone/llvm/utils/TableGen/SequenceToOffsetTable.h
Rot127 c0317ac800 Rebase refactored TableGen backends onto LLVM 18.
The MCInstDesc table changed. Besides this, only minor changes were made
and some additional code is now emitted for LLVM.

This commit is the combination of all previous Auto-Sync commits.
The list of commit messages follows:

-----------

Combination of all commits of the refactored tablegen backends.

These are the changes made for LLVM 16.

Refactor Capstone relevant TableGen Emitter backends.

This commit extracts the code which emits generated tables into two printer classes.
The Printer is called whenever actual code is written to a file.
There is the PrinterLLVM, which emits the code as before, and
PrinterCapstone, which is tailored to our needs (emitting C and generating
more info).

Additionally, missing memory access properties were added to ARM's td
files.

Emit a single header for all files.

Capitalize Target name for enums.

Add lay metric to emit enum value for Banked and system regs.

Malloc substr

Sort instructions in ascending order.

Free substr after use

Add vanished constraints

Fix `regInfoEmitEnums()` and indent

Fix `GenDisassemblerTables.inc#checkDecoderPredicate()`

Fix `TriCoreGenRegisterInfo.inc` | `PrinterCapstone::regInfoEmitRegClasses`

revert changes to NEON instructions

Add instructions with duplicate operands as Matchables.

Add memory load and store info

Correct memory access and out operand info

Set register lists again as read ops due to https://github.com/llvm/llvm-project/issues/62455

Make printAliasInstr and getMnemonic static.

Generate CS instruction enums from actual mnemonic. Not via the flawed AsmMatcher.

Fix typo in InstrInfoEmitter.cpp

Add deprecated QPX feature

Replace + and - with p and m

Add AssemblerPredicates to PPC

Generate RegEncodingTable

Define functions which are called by the Mapper as static.

Necessary because these functions are present in each arch.

Remove set_mem_access().

The cases where this is used to mark access to actual memory operands are
either very rare, or those are NEON lane indices.

Generate correct op type for absolute addresses.

Check for RegisterPointer operands first to prevent mis-categorization.

Add missing Operand types

Generate Instruction formats for PPC.

Add Paired Single instructions.

Partly revert 94e41ce23a7fd863a96288ec05b6c7202c3cfbf1 (introduces accidentally removed code.)

Set correct operand types for PS operands

Add memory read/write attributes

Add missing operand types

Add mayLoad and mayStore information.

Add documentation.

Handle special AArch64 operand

Replace C++ with C code.

Check for duplicate enum instr. names

Check for duplicate definitions of system registers.

Add note about missing target names.

Resolve templates in a single static method and add docs about it.

Revert printing target name in upper case.

Partially revert C++ syntax fixes in .td files.

They break the TemplateCollector since it searches for exactly those references but can't find any.

Add all SubtargetFeatures to feature enum.

Not just the one used by CGIs.

Pass Decoder

Enable checking of specific table fields to determine if reg enum must be emitted.

Allow adding a namespace to the type name.

Formatting

Rework emitting of tables.

The system operands are now emitted in reg, imm and alias groups.
Also, a bug which emitted incorrect code was fixed.

Check for rename IMPLICIT_IMM operand types

Pass DecodeComplete as pointer not as reference

Print undef when it needs to be printed.

Add namespace ids to all types and functions.

Rework C translation.

Pass MCOp as pointer not as ref

Add missing SysImm type

Fix syntax mistakes

Generate additional sys immediates and op groups.

Handle edge case for printSVERegOp

Handle default arguments of template functions.

Add two missing op groups

Generate a static RegEncodingTable

Set enum values to encodings of the sys ops

Generate a single Enum value file for system operands.

Replace System operand groups with their operand types

Fix missing braces warning

Emit MCOperand validator.

Emit lookupByName functions for sys operands

Add namespaces for ARM.

Check for Target if default arguments of template functions are resolved.

auto-sync opcode & operand encoding info generation (#14)

* Added operand and opcode info generation

* Wrapped deprecated macro under an intellisense check

Basically, IntelliSense fails, causing multiple errors in other files,
so when IntelliSense parses the code it will use a different version of the macro

* Fixed a small bug

Used double braces to prevent an old bug

Removed extra new line and fixed a bug regarding move semantics
2024-05-29 08:31:35 +00:00


//===-- SequenceToOffsetTable.h - Compress similar sequences ----*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// SequenceToOffsetTable can be used to emit a number of null-terminated
// sequences as one big array. Use the same memory when a sequence is a suffix
// of another.
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_UTILS_TABLEGEN_SEQUENCETOOFFSETTABLE_H
#define LLVM_UTILS_TABLEGEN_SEQUENCETOOFFSETTABLE_H
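#include "PrinterTypes.h"
#include "llvm/ADT/Twine.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/ErrorHandling.h" // llvm_unreachable
#include "llvm/Support/raw_ostream.h"
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <map>
#include <utility> // std::make_pair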
namespace llvm {
extern llvm::cl::opt<bool> EmitLongStrLiterals;
static inline void printChar(raw_ostream &OS, char C) {
  unsigned char UC(C);
  if (isalnum(UC) || ispunct(UC)) {
    OS << '\'';
    if (C == '\\' || C == '\'')
      OS << '\\';
    OS << C << '\'';
  } else {
    OS << unsigned(UC);
  }
}
/// SequenceToOffsetTable - Collect a number of terminated sequences of T.
/// Compute the layout of a table that contains all the sequences, possibly by
/// reusing entries.
///
/// @tparam SeqT The sequence container (vector or string).
/// @tparam Less A stable comparator for SeqT elements.
template<typename SeqT, typename Less = std::less<typename SeqT::value_type> >
class SequenceToOffsetTable {
  typedef typename SeqT::value_type ElemT;

  // Define a comparator for SeqT that sorts a suffix immediately before a
  // sequence with that suffix.
  struct SeqLess {
    Less L;
    bool operator()(const SeqT &A, const SeqT &B) const {
      return std::lexicographical_compare(A.rbegin(), A.rend(),
                                          B.rbegin(), B.rend(), L);
    }
  };

  // Keep sequences ordered according to SeqLess so suffixes are easy to find.
  // Map each sequence to its offset in the table.
  typedef std::map<SeqT, unsigned, SeqLess> SeqMap;

  // Sequences added so far, with suffixes removed.
  SeqMap Seqs;

  // Entries in the final table, or 0 before layout() is called.
  unsigned Entries;

  // The output language of the table.
  PrinterLanguage PL;

  // If set, wrap the table content in an #ifndef CAPSTONE_DIET guard.
  bool CSDietGuard;

  // isSuffix - Returns true if A is a suffix of B.
  static bool isSuffix(const SeqT &A, const SeqT &B) {
    return A.size() <= B.size() && std::equal(A.rbegin(), A.rend(), B.rbegin());
  }

public:
  SequenceToOffsetTable()
      : Entries(0), PL(PRINTER_LANG_CPP), CSDietGuard(false) {}
  SequenceToOffsetTable(PrinterLanguage PL, bool CSDiet = false)
      : Entries(0), PL(PL), CSDietGuard(CSDiet) {}

  /// add - Add a sequence to the table.
  /// This must be called before layout().
  void add(const SeqT &Seq) {
    assert(Entries == 0 && "Cannot call add() after layout()");
    typename SeqMap::iterator I = Seqs.lower_bound(Seq);

    // If SeqMap contains a sequence that has Seq as a suffix, I will be
    // pointing to it.
    if (I != Seqs.end() && isSuffix(Seq, I->first))
      return;

    I = Seqs.insert(I, std::make_pair(Seq, 0u));

    // The entry before I may be a suffix of Seq that can now be erased.
    if (I != Seqs.begin() && isSuffix((--I)->first, Seq))
      Seqs.erase(I);
  }

  bool empty() const { return Seqs.empty(); }

  unsigned size() const {
    assert((empty() || Entries) && "Call layout() before size()");
    return Entries;
  }

  /// layout - Computes the final table layout.
  void layout() {
    assert(Entries == 0 && "Can only call layout() once");
    // Lay out the table in Seqs iteration order.
    for (typename SeqMap::iterator I = Seqs.begin(), E = Seqs.end(); I != E;
         ++I) {
      I->second = Entries;
      // Include space for a terminator.
      Entries += I->first.size() + 1;
    }
  }

  /// get - Returns the offset of Seq in the final table.
  unsigned get(const SeqT &Seq) const {
    assert(Entries && "Call layout() before get()");
    typename SeqMap::const_iterator I = Seqs.lower_bound(Seq);
    assert(I != Seqs.end() && isSuffix(Seq, I->first) &&
           "get() called with sequence that wasn't added first");
    return I->second + (I->first.size() - Seq.size());
  }

  /// emitStringLiteralDef - Dispatch to the string-literal emitter for the
  /// output language PL.
  void emitStringLiteralDef(raw_ostream &OS, const llvm::Twine &Decl) const {
    switch (PL) {
    default:
      llvm_unreachable("Language not specified to print table in.");
    case PRINTER_LANG_CPP:
      emitStringLiteralDefCPP(OS, Decl);
      break;
    case PRINTER_LANG_CAPSTONE_C:
      emitStringLiteralDefCCS(OS, Decl);
      break;
    }
  }

  /// emit - Dispatch to the table emitter for the output language PL.
  void emit(raw_ostream &OS,
            void (*Print)(raw_ostream&, ElemT),
            const char *Term = "0") const {
    switch (PL) {
    default:
      llvm_unreachable("Language not specified to print table in.");
    case PRINTER_LANG_CPP:
      emitCPP(OS, Print, Term);
      break;
    case PRINTER_LANG_CAPSTONE_C:
      emitCCS(OS, Print, Term);
      break;
    }
  }

  /// emitStringLiteralDefCPP - Print out the table as the body of an array
  /// initializer, where each element is a C string literal terminated by
  /// `\0`. Falls back to emitting a comma-separated integer list if
  /// `EmitLongStrLiterals` is false.
  void emitStringLiteralDefCPP(raw_ostream &OS, const llvm::Twine &Decl) const {
    assert(Entries && "Call layout() before emitStringLiteralDef()");
    if (!EmitLongStrLiterals) {
      OS << Decl << " = {\n";
      emit(OS, printChar, "0");
      OS << " 0\n};\n\n";
      return;
    }

    OS << "\n#ifdef __GNUC__\n"
       << "#pragma GCC diagnostic push\n"
       << "#pragma GCC diagnostic ignored \"-Woverlength-strings\"\n"
       << "#endif\n"
       << Decl << " = {\n";
    // Iterate by const reference to avoid copying each sequence.
    for (const auto &I : Seqs) {
      OS << " /* " << I.second << " */ \"";
      OS.write_escaped(I.first);
      OS << "\\0\"\n";
    }
    OS << "};\n"
       << "#ifdef __GNUC__\n"
       << "#pragma GCC diagnostic pop\n"
       << "#endif\n\n";
  }

  /// emitStringLiteralDefCCS - Capstone C variant of emitStringLiteralDefCPP.
  /// Omits the GCC diagnostic pragmas and, if CSDietGuard is set, wraps the
  /// table in an #ifndef CAPSTONE_DIET guard.
  void emitStringLiteralDefCCS(raw_ostream &OS, const llvm::Twine &Decl) const {
    assert(Entries && "Call layout() before emitStringLiteralDef()");
    if (!EmitLongStrLiterals) {
      if (CSDietGuard)
        OS << "#ifndef CAPSTONE_DIET\n";
      OS << Decl << " = {\n";
      emit(OS, printChar, "0");
      OS << " 0\n};\n";
      if (CSDietGuard)
        OS << "#endif // CAPSTONE_DIET\n\n";
      OS << "\n";
      return;
    }

    if (CSDietGuard)
      OS << "#ifndef CAPSTONE_DIET\n";
    OS << Decl << " = {\n";
    // Iterate by const reference to avoid copying each sequence.
    for (const auto &I : Seqs) {
      OS << " /* " << I.second << " */ \"";
      OS.write_escaped(I.first);
      OS << "\\0\"\n";
    }
    OS << "};\n";
    if (CSDietGuard)
      OS << "#endif // CAPSTONE_DIET\n\n";
  }

  /// emitCPP - Print out the table as the body of an array initializer.
  /// Use the Print function to print elements.
  void emitCPP(raw_ostream &OS,
               void (*Print)(raw_ostream&, ElemT),
               const char *Term) const {
    assert((empty() || Entries) && "Call layout() before emit()");
    for (typename SeqMap::const_iterator I = Seqs.begin(), E = Seqs.end();
         I != E; ++I) {
      OS << " /* " << I->second << " */ ";
      for (typename SeqT::const_iterator SI = I->first.begin(),
             SE = I->first.end(); SI != SE; ++SI) {
        Print(OS, *SI);
        OS << ", ";
      }
      OS << Term << ",\n";
    }
  }

  /// emitCCS - Print out the same array initializer body as emitCPP for the
  /// Capstone C output.
  void emitCCS(raw_ostream &OS,
               void (*Print)(raw_ostream&, ElemT),
               const char *Term) const {
    assert((empty() || Entries) && "Call layout() before emit()");
    for (typename SeqMap::const_iterator I = Seqs.begin(), E = Seqs.end();
         I != E; ++I) {
      OS << " /* " << I->second << " */ ";
      for (typename SeqT::const_iterator SI = I->first.begin(),
             SE = I->first.end(); SI != SE; ++SI) {
        Print(OS, *SI);
        OS << ", ";
      }
      OS << Term << ",\n";
    }
  }
};
} // end namespace llvm
#endif
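
The suffix-sharing layout described in the header comment can be tried outside of TableGen. What follows is a minimal standalone sketch, assuming `std::string` sequences and no LLVM dependencies. The class name `SuffixSharingTable` is hypothetical and is not part of LLVM or Capstone. It mirrors only `add()`, `layout()`, `get()` and `size()`, and leaves out the emit functions and the output-language handling.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for SequenceToOffsetTable<std::string>.
class SuffixSharingTable {
  // Compare strings back to front so that a suffix sorts immediately before
  // the longer strings that end with it.
  struct ReverseLess {
    bool operator()(const std::string &A, const std::string &B) const {
      return std::lexicographical_compare(A.rbegin(), A.rend(),
                                          B.rbegin(), B.rend());
    }
  };

  std::map<std::string, unsigned, ReverseLess> Seqs;
  unsigned Entries = 0;

  // Returns true if A is a suffix of B.
  static bool isSuffix(const std::string &A, const std::string &B) {
    return A.size() <= B.size() &&
           std::equal(A.rbegin(), A.rend(), B.rbegin());
  }

public:
  // Add a sequence unless a stored sequence already ends with it.
  void add(const std::string &Seq) {
    assert(Entries == 0 && "Cannot call add() after layout()");
    auto I = Seqs.lower_bound(Seq);
    if (I != Seqs.end() && isSuffix(Seq, I->first))
      return;
    I = Seqs.emplace_hint(I, Seq, 0u);
    // The entry before I may be a suffix of Seq and is then redundant.
    if (I != Seqs.begin() && isSuffix((--I)->first, Seq))
      Seqs.erase(I);
  }

  // Assign each stored sequence its start offset, one terminator per entry.
  void layout() {
    for (auto &Entry : Seqs) {
      Entry.second = Entries;
      Entries += static_cast<unsigned>(Entry.first.size()) + 1;
    }
  }

  // Offset of Seq: start of the containing sequence plus the skipped prefix.
  unsigned get(const std::string &Seq) const {
    auto I = Seqs.lower_bound(Seq);
    assert(I != Seqs.end() && isSuffix(Seq, I->first) &&
           "get() called with sequence that wasn't added first");
    return I->second +
           static_cast<unsigned>(I->first.size() - Seq.size());
  }

  unsigned size() const { return Entries; }
};
```

Under these assumptions, adding "bc", "abc", "xyz" and "c" and then calling `layout()` stores only "abc" and "xyz". `get("abc")` returns 0, `get("bc")` returns 1, `get("c")` returns 2, `get("xyz")` returns 4, and `size()` is 8 (two three-element sequences plus one terminator each).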