capstone/HACK.TXT

134 lines
5.0 KiB
Plaintext
Raw Normal View History

2019-02-01 23:39:07 +00:00
Code structure
--------------
2013-11-27 06:33:13 +00:00
2019-02-01 23:39:07 +00:00
Capstone source is organized as followings.
2013-12-05 02:25:51 +00:00
. <- core engine + README + COMPILE.TXT etc
2013-11-27 09:00:06 +00:00
├── arch <- code handling disasm engine for each arch
2013-12-05 02:25:51 +00:00
│   ├── AArch64 <- ARM64 (aka ARMv8) engine
│   ├── ARM <- ARM engine
│   ├── BPF <- Berkeley Packet Filter engine
2018-12-12 09:30:45 +00:00
│   ├── EVM <- Ethereum engine
M680X: Target ready for pull request (#1034) * Added new M680X target. Supports M6800/1/2/3/9, HD6301 * M680X: Reformat for coding guide lines. Set alphabetical order in HACK.TXT * M680X: Prepare for python binding. Move cs_m680x, m680x_insn to m680x_info. Chec > k cpu type, no default. * M680X: Add python bindings. Added python tests. * M680X: Added cpu types to usage message. * cstool: Avoid segfault for invalid <arch+mode>. * Make test_m680x.c/test_m680x.py output comparable (diff params: -bu). Keep xprint.py untouched. * M680X: Update CMake/make for m680x support. Update .gitignore. * M680X: Reduce compiler warnings. * M680X: Reduce compiler warnings. * M680X: Reduce compiler warnings. * M680X: Make test_m680x.c/test_m680x.py output comparable (diff params: -bu). * M680X: Add ocaml bindings and tests. * M680X: Add java bindings and tests. * M680X: Added tests for all indexed addressing modes. C/Python/Ocaml * M680X: Naming, use page1 for PAGE1 instructions (without prefix). * M680X: Naming, use page1 for PAGE1 instructions (without prefix). * M680X: Used M680X_FIRST_OP_IN_MNEM in tests C/python/java/ocaml. * M680X: Added access property to cs_m680x_op. * M680X: Added operand size. * M680X: Remove compiler warnings. * M680X: Added READ/WRITE access property per operator. * M680X: Make reg_inherent_hdlr independent of CPU type. * M680X: Add HD6309 support + bug fixes * M680X: Remove errors and warning. * M680X: Add Bcc/LBcc to group BRAREL (relative branch). * M680X: Add group JUMP to BVS/BVC/LBVS/LBVC. Remove BRAREL from BRN/LBRN. * M680X: Remove LBRN from group BRAREL. * M680X: Refactored cpu_type initialization for better readability. * M680X: Add two operands for insn having two reg. in mnemonic. e.g. ABX. * M680X: Remove typo in cstool.c * M680X: Some format improvements in changed_regs. * M680X: Remove insn id string list from tests (C/python/java/ocaml). * M680X: SEXW, set access of reg. D to WRITE. * M680X: Sort changed_regs in increasing m680x_insn order. * M680X: Add M68HC11 support + Reduced from two to one INDEXED operand. * M680X: cstool, also write '(in mnemonic)' for second reg. operand. * M680X: Add BRN/LBRN to group JUMP and BRAREL. * M680X: For Bcc/LBcc/BRSET/BRCLR set reg. CC to read access. * M680X: Correctly print negative immediate values with option CS_OPT_UNSIGNED. * M680X: Rename some instruction handlers. * M680X: Add M68HC05 support. * M680X: Dont print prefix '<' for direct addr. mode. * M680X: Add M68HC08 support + resorted tables + bug fixes. * M680X: Add Freescale HCS08 support. * M680X: Changed group names, avoid spaces. * M680X: Refactoring, rename addessing mode handlers. * M680X: indexed addr. mode, changed pre/post inc-/decrement representation. * M680X: Rename some M6809/HD6309 specific functions. * M680X: Add CPU12 (68HC12/HCS12) support. * M680X: Correctly display illegal instruction as FCB . * M680X: bugfix: BRA/BRN/BSR/LBRA/LBRN/LBSR does not read CC reg. * M680X: bugfix: Correctly check for sufficient code size for M6809 indexed addressing. * M680X: Better support for changing insn id within handler for addessing mode. * M680X: Remove warnings. * M680X: In set_changed_regs_read_write_counts use own access_mode. * M680X: Split cpu specific tables into separate *.inc files. * M680X: Remove warnings. * M680X: Removed address_mode. Addressing mode is available in operand.type * M680X: Bugfix: BSET/BCLR/BRSET/BRCLR correct read/modify CC reg. * M680X: Remove register TMP1. It is first visible in CPU12X. * M680X: Performance improvement + bug fixes. * M680X: Performance improvement, make cpu_tables const static. * M680X: Simplify operand decoding by using two handlers. * M680X: Replace M680X_OP_INDEX by M680X_OP_CONSTANT + bugfix in java/python/ocaml bindings. * M680X: Format with astyle. * M680X: Update documentation. * M680X: Corrected author for m680x specific files. * M680X: Make max. number of architectures single source.
2017-10-21 13:44:36 +00:00
│   ├── M680X <- M680X engine
2015-10-05 08:14:19 +00:00
│   ├── M68K <- M68K engine
2013-12-05 02:25:51 +00:00
│   ├── Mips <- Mips engine
│   ├── MOS65XX <- MOS65XX engine
2014-01-07 03:46:21 +00:00
│   ├── PowerPC <- PowerPC engine
2023-04-24 14:53:03 +00:00
│   ├── RISCV <- RISCV engine
│   ├── SH <- SH engine
│   ├── Sparc <- Sparc engine
│   ├── SystemZ <- SystemZ engine
│   ├── TMS320C64x <- TMS320C64x engine
│   ├── TriCore <- TriCore engine
│   └── WASM <- WASM engine
2013-11-27 13:29:12 +00:00
├── bindings <- all bindings are under this dir
2013-11-27 09:00:06 +00:00
│   ├── java <- Java bindings + test code
│   ├── ocaml <- Ocaml bindings + test code
2014-11-01 21:14:13 +00:00
│   └── python <- Python bindings + test code
2014-11-01 16:34:14 +00:00
├── contrib <- Code contributed by community to help Capstone integration
├── cstool <- Cstool
2014-11-01 16:34:14 +00:00
├── docs <- Documentation
2013-12-05 02:25:51 +00:00
├── include <- API headers in C language (*.h)
2014-11-01 16:34:14 +00:00
├── msvc <- Microsoft Visual Studio support (for Windows compile)
├── packages <- Packages for Linux/OSX/BSD.
2016-05-12 04:48:32 +00:00
├── windows <- Windows support (for Windows kernel driver compile)
2014-01-07 03:34:05 +00:00
├── suite <- Development test tools - for Capstone developers only
2013-12-05 02:25:51 +00:00
├── tests <- Test code (in C language)
2014-11-01 21:14:13 +00:00
└── xcode <- Xcode support (for MacOSX compile)
2013-11-27 06:33:13 +00:00
2014-11-01 16:34:14 +00:00
Follow instructions in COMPILE.TXT for how to compile and run test code.
2013-11-27 13:29:12 +00:00
2013-12-05 02:25:51 +00:00
Note: if you find some strange bugs, it is recommended to firstly clean
the code and try to recompile/reinstall again. This can be done with:
2013-11-27 06:33:13 +00:00
$ ./make.sh
$ sudo ./make.sh install
2013-11-27 06:33:13 +00:00
Then test Capstone with cstool, for example:
$ cstool x32 "90 91"
2014-11-01 16:34:14 +00:00
At the same time, for Java/Ocaml/Python bindings, be sure to always use
the bindings coming with the core to avoid potential incompatibility issue
2014-11-01 16:34:14 +00:00
with older versions.
See bindings/<language>/README for detail instructions on how to compile &
install the bindings.
2019-02-01 23:39:07 +00:00
Coding style
------------
- C code follows Linux kernel coding style, using tabs for indentation.
- Python code uses 4 spaces for indentation.
Architecture updater (auto-sync) - Updating ARM (#1949) * Add auto-sync updater. * Update Capstone core with auto-sync changes. * Update ARM via auto-sync. * Make changes to arch modules which are introduced by auto-sync. * Update tests for ARM. * Fix build warnings for make * Remove meson.build * Print shift amount in decimal * Patch non LLVM register alias. * Change type of immediate operand to unsiged (due to: #771) * Replace all occurances of a register with its alias. * Fix printing of signed imms * Print rotate amount in decimal * CHange imm type to int64_t to match LLVM imm type. * Fix search for register names, by completing string first. * Print ModImm operands always in decimal * Use number format of previous capstone version. * Correct implicit writes and update_flags according to SBit. * Add missing test for RegImmShift * Reverse incorrect comparision. * Set shift information for move instructions. * Set mem access for all memory operands * Set subtracted flag if offset is negative. * Add flag for post-index memory operands. * Add detail op for BX_RET and MOVPCLR * Use instruction post_index operand. * Add VPOP and VPUSH as unique CS IDs. * Add shifting info for MOVsr. * Add TODOs. * Add in LLVM hardcoded operands to detail. * Move detail editing from InstPrinter to Mapping * Formatting * Add removed check. * Add writeback register and constraints to RFEI instructions. * Translate shift immediate * Print negative immediates * Remove duplicate invalid entry * Add CS groups to instructions * Fix write attriutes of stores. * Add missing names of added instructions * Fix LLVM bug * Add more post_index flags * http -> https * Make generated functions static * Remove tab prefix for alias instructions. * Set ValidateMCOperand to NULL. * Fix AddrMode3Operand operands * Allow getting system and banked register name via API * Add writeback to STC/LDC instructions. * Fix (hopefully) last case where disp is negative and subtracted = true * Remove accidentially introduced regressions
2023-07-19 09:56:27 +00:00
Updating an Architecture
------------------------
The update tool for Capstone is called `auto-sync` and can be found in `suite/auto-sync`.
Not all architectures are supported yet.
Run `suite/auto-sync/Update-Arch.sh -h` to get a list of currently supported architectures.
The documentation how to update with `auto-sync` or refactor an architecture module
can be found in [docs/AutoSync.md](docs/AutoSync.md).
If a module does not support `auto-sync` yet, it is highly recommended to refactor it
instead of attempting to update it manually.
Refactoring will take less time and updates it during the procedure.
The one exception is `x86`. In LLVM we use several emitter backends to generate C code.
One of those LLVM backends (the `DecoderEmitter`) has two versions.
One for `x86` and another for all the other architectures.
Until now it was not worth it to refactoring this unique `x86` backend. So `x86` is not
supported currently.
2019-02-01 23:39:07 +00:00
Adding an architecture
----------------------
Architecture updater (auto-sync) - Updating ARM (#1949) * Add auto-sync updater. * Update Capstone core with auto-sync changes. * Update ARM via auto-sync. * Make changes to arch modules which are introduced by auto-sync. * Update tests for ARM. * Fix build warnings for make * Remove meson.build * Print shift amount in decimal * Patch non LLVM register alias. * Change type of immediate operand to unsiged (due to: #771) * Replace all occurances of a register with its alias. * Fix printing of signed imms * Print rotate amount in decimal * CHange imm type to int64_t to match LLVM imm type. * Fix search for register names, by completing string first. * Print ModImm operands always in decimal * Use number format of previous capstone version. * Correct implicit writes and update_flags according to SBit. * Add missing test for RegImmShift * Reverse incorrect comparision. * Set shift information for move instructions. * Set mem access for all memory operands * Set subtracted flag if offset is negative. * Add flag for post-index memory operands. * Add detail op for BX_RET and MOVPCLR * Use instruction post_index operand. * Add VPOP and VPUSH as unique CS IDs. * Add shifting info for MOVsr. * Add TODOs. * Add in LLVM hardcoded operands to detail. * Move detail editing from InstPrinter to Mapping * Formatting * Add removed check. * Add writeback register and constraints to RFEI instructions. * Translate shift immediate * Print negative immediates * Remove duplicate invalid entry * Add CS groups to instructions * Fix write attriutes of stores. * Add missing names of added instructions * Fix LLVM bug * Add more post_index flags * http -> https * Make generated functions static * Remove tab prefix for alias instructions. * Set ValidateMCOperand to NULL. * Fix AddrMode3Operand operands * Allow getting system and banked register name via API * Add writeback to STC/LDC instructions. * Fix (hopefully) last case where disp is negative and subtracted = true * Remove accidentially introduced regressions
2023-07-19 09:56:27 +00:00
If your architecture is supported in LLVM or one of its forks, you can use `auto-sync` to
add the new module.
<!-- TODO: Move this info to the auto-sync docs -->
Obviously, you first need to write all the logic and put it in a new directory arch/newarch
Then, you have to modify other files.
(You can look for one architecture such as EVM in these files to get what you need to do)
2019-02-01 23:39:07 +00:00
Integrate:
- cs.c
- cstool/cstool.c
2019-02-01 23:39:07 +00:00
- cstool/cstool_newarch.c: print the architecture specific details
- include/capstone/capstone.h
2019-02-01 23:39:07 +00:00
- include/capstone/newarch.h: create this file to export all specifics about the new architecture
2019-02-01 23:39:07 +00:00
Compile:
- CMakeLists.txt
- Makefile
- config.mk
2019-02-01 23:39:07 +00:00
Tests:
- tests/Makefile
- tests/test_basic.c
- tests/test_detail.c
- tests/test_iter.c
- tests/test_newarch.c
- suite/fuzz/platform.c: add the architecture and its modes to the list of fuzzed platforms
2019-02-18 16:10:36 +00:00
- suite/capstone_get_setup.c
- suite/MC/newarch/mode.mc: samples
- suite/test_corpus.py: correspondence between architecture and mode as text and architecture number for fuzzing
2019-02-01 23:39:07 +00:00
Bindings:
- bindings/Makefile
2019-02-01 23:39:07 +00:00
- bindings/const_generator.py: add the header file and the architecture
- bindings/python/Makefile
- bindings/python/capstone/__init__.py
2019-02-01 23:39:07 +00:00
- bindings/python/capstone/newarch.py: define the python structures
- bindings/python/capstone/newarch_const.py: generate this file
- bindings/python/test_newarch.py: create a basic decoding test
- bindings/python/test_all.py
2019-02-01 23:39:07 +00:00
Docs:
- README.md
- HACK.txt
2019-02-01 23:39:07 +00:00
- CREDITS.txt: add your name