
Architecture updater

This is Capstone's updater for some architectures. Not all architectures are supported yet.

Install dependencies

Install clang-format

# clang-format version must be at least 16
sudo apt install clang-format-18
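
To confirm that the installed clang-format meets the minimum version, a small check like the following could be used. This is only a sketch: `version_at_least` is a hypothetical helper, it assumes a `sort` that supports `-V` (GNU coreutils), and the binary name `clang-format-18` is taken from the install command above and may differ on your system.

```shell
# Hypothetical helper: succeeds if dotted version $1 is >= $2.
version_at_least() {
    # sort -V orders version strings; the smaller one comes first.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check if the binary exists (the name may differ on your system).
if command -v clang-format-18 >/dev/null 2>&1; then
    # Extract the first dotted number from the --version output.
    installed="$(clang-format-18 --version | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
    if version_at_least "$installed" 16; then
        echo "clang-format $installed is new enough"
    else
        echo "clang-format $installed is too old (need >= 16)" >&2
    fi
fi
```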

Setup Python environment and Tree-sitter

cd <root-dir-Capstone>
# Python version must be at least 3.11
sudo apt install python3-venv
# Setup virtual environment in Capstone root dir
python3 -m venv ./.venv
source ./.venv/bin/activate
pip3 install -r requirements.txt
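
Before running the updater, it can help to verify that the virtual environment is active and that the interpreter is new enough. The sketch below assumes the standard behaviour of the venv `activate` script, which exports `VIRTUAL_ENV`; `in_venv` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if a virtual environment is active.
# The venv "activate" script exports VIRTUAL_ENV, so an empty value means no venv.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "virtual environment active: $VIRTUAL_ENV"
else
    echo "no virtual environment active; run: source ./.venv/bin/activate" >&2
fi

# Report whether the interpreter is 3.11 or newer (the minimum stated above).
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import sys; print("Python OK" if sys.version_info >= (3, 11) else "Python too old")'
fi
```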

Clone C++ grammar

cd suite/auto-sync/
git submodule update --init --recursive ./vendor/

Update

Check if your architecture is supported.

./Updater/Updater.py -h

Clone Capstone's LLVM fork and build llvm-tblgen

git clone https://github.com/capstone-engine/llvm-capstone
cd llvm-capstone
git checkout auto-sync
mkdir build
cd build
# You can also build the "Release" version
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug ../llvm
cmake --build . --target llvm-tblgen --config Debug
cd ../../
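
After the build, you may want to check that the llvm-tblgen binary was actually produced. The sketch below assumes the usual LLVM CMake layout, where tools end up in `bin/` inside the build directory; `tblgen_path` is a hypothetical helper, not part of the updater.

```shell
# Hypothetical helper: print the llvm-tblgen path inside an LLVM build directory, or fail.
tblgen_path() {
    candidate="$1/bin/llvm-tblgen"
    if [ -x "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        echo "llvm-tblgen not found in $1/bin" >&2
        return 1
    fi
}

# Usage, from the directory that contains the llvm-capstone clone:
#   tblgen_path ./llvm-capstone/build
```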

Run the updater

TODO: REWORK
mkdir build
cd build
../Update-Arch.sh <ARCH> ./llvm-capstone

Post-processing steps

The update translates some LLVM C++ files to C. Because the translation is not perfect (maybe it will be some day), you will get build errors if you try to compile Capstone.

The last step to finish the update is to fix those build errors by hand.

Developer

Overview of updated files

This is a rough overview of which files of an architecture are updated and where they come from.

Files originating from LLVM (Automatically updated)

TODO: The "LLVM*" files are not renamed yet.

  • <ARCH>LLVM*.*: These files are LLVM source files which were translated from C++ to C.
    • Because the translation is not perfect, those files need some hands-on work afterwards (see below).
  • <ARCH>Gen*.inc: These files are exclusively generated by LLVM TableGen backends.

These files form the actual disassembler and assembler printer.

Capstone module files (Not automatically updated)

  • <ARCH>Mapping.*: Binding code between the architecture module and the LLVM files.
  • <ARCH>Module.*: Interface for the Capstone core.
  • <ARCH>DisassemblerExtension.*: All kinds of functions which are needed by <ARCH>LLVMDisassembler.c but could not be generated or translated.

Update procedure

  1. Run the Update-Arch.sh script.
  2. Compare the functions in <ARCH>DisassemblerExtension.* to LLVM (search for the function names in the LLVM root) and update them if necessary.
  3. Try to build Capstone and fix the build errors.
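
For step 2, a recursive grep over the LLVM checkout is one simple way to locate the upstream counterpart of a function. This is only a sketch: `find_in_llvm` is a hypothetical helper, it assumes a grep that supports `-r` and `--include` (GNU or BSD grep), and the function name and path in the usage comment are placeholders.

```shell
# Hypothetical helper: list LLVM source files that mention a function name.
#   $1: function name to look for (matched as a whole word)
#   $2: LLVM root directory (e.g. the llvm-capstone clone)
find_in_llvm() {
    grep -rlw --include='*.cpp' --include='*.h' -- "$1" "$2"
}

# Usage (name and path are placeholders):
#   find_in_llvm SomeDecoderFunction ./llvm-capstone
```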

Update details

LLVM file translation

For details about the C++ to C translation of the LLVM files refer to CppTranslator/README.md.

Generated .inc files

Documentation about the .inc file generation is in the llvm-capstone repository.

  • If some features were not generated and are missing in the .inc files, make sure they are defined as AssemblerPredicate in the .td files.

    Correct:

    def In32BitMode  : Predicate<"!Subtarget->isPPC64()">,
      AssemblerPredicate<(all_of (not Feature64Bit)), "64bit">;
    

    Incorrect:

    def In32BitMode  : Predicate<"!Subtarget->isPPC64()">;
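
To spot definitions like the incorrect one, the .td files can be scanned for `Predicate` defs that never mention `AssemblerPredicate`. The following is a heuristic sketch, not part of the updater: `missing_asm_predicates` is a hypothetical helper, and it assumes each def starts at the beginning of a line and ends at a line with a trailing `;`.

```shell
# Hypothetical helper: print names of Predicate defs that lack an AssemblerPredicate.
# Arguments: one or more .td files.
missing_asm_predicates() {
    awk '
        # A new def whose first parent class is Predicate<...>
        /^def[ \t]+[A-Za-z0-9_]+[ \t]*:[ \t]*Predicate</ {
            rec = ""; name = $2; sub(/:.*/, "", name); in_def = 1
        }
        in_def {
            rec = rec $0
            if ($0 ~ /;[ \t]*$/) {   # heuristic: the def ends at a trailing ";"
                if (rec !~ /AssemblerPredicate/) print name
                in_def = 0
            }
        }
    ' "$@"
}

# Usage (path is a placeholder):
#   missing_asm_predicates path/to/target/*.td
```

Any name it prints is a candidate for adding an `AssemblerPredicate`; whether that is appropriate still has to be decided per definition.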
    

Formatting

  • If you make changes to the CppTranslator, please format the files with black:
    source ./.venv/bin/activate
    pip3 install black
    python3 -m black --line-length=120 CppTranslator/*/*.py