mm/tools/decomp-permuter
Tharo 7743e5a2c4
Overhaul the build system (#234)
2021-08-03 23:21:31 -04:00
.github/workflows Convert every submodule into subrepo (#170) 2021-06-07 18:31:56 -04:00
src Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
stubs/pycparser Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
test Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
.gitignore Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
.gitrepo Convert every submodule into subrepo (#170) 2021-06-07 18:31:56 -04:00
.pre-commit-config.yaml Convert every submodule into subrepo (#170) 2021-06-07 18:31:56 -04:00
compile_example.sh Convert every submodule into subrepo (#170) 2021-06-07 18:31:56 -04:00
diff.sh Convert every submodule into subrepo (#170) 2021-06-07 18:31:56 -04:00
import.py Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
mypy.ini Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
pah.py Convert every submodule into subrepo (#170) 2021-06-07 18:31:56 -04:00
permuter_settings_example.toml Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
permuter.py Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
README.md Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
run-tests.sh Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
sort_cands.sh Convert every submodule into subrepo (#170) 2021-06-07 18:31:56 -04:00
strip_other_fns.py Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
test.py Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00
USAGE.md Overhaul the build system (#234) 2021-08-03 23:21:31 -04:00

Decomp permuter

Automatically permutes C files to better match a target binary. The permuter has two modes of operation:

  • Random: purely at random, introduce temporary variables for values, change types, put statements on the same line...
  • Manual: test all combinations of user-specified variations, using macros like PERM_TERNARY(a = , b, c, d) to try both a = b ? c : d and if (b) a = c; else a = d;.

The modes can also be combined, by using the PERM_RANDOMIZE macro.
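To make the manual mode concrete, here is a minimal Python sketch of the two spellings that the PERM_TERNARY example above stands for. The function name expand_perm_ternary is made up for illustration and is not part of the permuter; the sketch only does string substitution and does not try to parse C.

```python
from typing import List


def expand_perm_ternary(prefix: str, cond: str, if_true: str, if_false: str) -> List[str]:
    """Return both forms described for PERM_TERNARY(prefix, cond, if_true, if_false).

    Hypothetical helper: plain string formatting, no C parsing.
    """
    # Form 1: a single ternary expression statement.
    ternary = f"{prefix}{cond} ? {if_true} : {if_false};"
    # Form 2: the equivalent if/else, with the prefix repeated in each branch.
    branches = f"if ({cond}) {prefix}{if_true}; else {prefix}{if_false};"
    return [ternary, branches]


for candidate in expand_perm_ternary("a = ", "b", "c", "d"):
    print(candidate)
# prints:
#   a = b ? c : d;
#   if (b) a = c; else a = d;
```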

The main target for the tool is MIPS code compiled by old compilers (IDO, possibly GCC). Getting it to work on other architectures shouldn't be too hard, however. https://github.com/laqieer/decomp-permuter-arm has an ARM port.

Usage

./permuter.py directory/ runs the permuter; see below for the meaning of the directory. Pass -h to see possible flags.

You'll first need to install a couple of prerequisites: python3 -m pip install attrs pycparser pynacl toml (also dataclasses if on Python 3.6 or below)

The permuter expects as input one or more directories, each containing:

  • a .c file with a single function,
  • a .o file to match,
  • a .sh file that compiles the .c file.
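Before starting a run it can be handy to check that a directory has all three kinds of file. The following is a rough sketch, not something the permuter ships: missing_inputs is a hypothetical helper, and the file names base.c and compile.sh in the demo are placeholders chosen for the example.

```python
import tempfile
from pathlib import Path
from typing import List

# The three kinds of file listed above, identified only by suffix.
REQUIRED_SUFFIXES = (".c", ".o", ".sh")


def missing_inputs(directory: Path) -> List[str]:
    """Return the required suffixes for which the directory contains no file."""
    present = {p.suffix for p in directory.iterdir() if p.is_file()}
    return [suffix for suffix in REQUIRED_SUFFIXES if suffix not in present]


# Demo on a throwaway directory that lacks the .o file.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    (work / "base.c").write_text("int f(void) { return 0; }\n")
    (work / "compile.sh").write_text("#!/bin/sh\n")
    print(missing_inputs(work))  # prints ['.o']
```

The check only looks at suffixes; it does not verify that the .c file holds a single function or that the script actually compiles it.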

For projects with a properly configured makefile, you should be able to set these up by running

./import.py <path/to/file.c> <path/to/file.s>

where file.c contains the function to be permuted, and file.s is its assembly in a self-contained file. Otherwise, see USAGE.md for more details.

The .c file may be modified with any of the following macros which affect manual permutation:

  • PERM_GENERAL(a, b, ...) expands to any of a, b, ...
  • PERM_TYPECAST(a, b, ...) expands to any of (a), (b), ... (empty argument for no cast at all)
  • PERM_TERNARY(prefix, a, b, c) expands to either prefix a ? b : c or if (a) prefix b; else prefix c;.
  • PERM_VAR(a, b) sets the meta-variable a to b, PERM_VAR(a) expands to the meta-variable a.
  • PERM_RANDOMIZE(code) expands to code, but allows randomization within that region.
  • PERM_LINESWAP(lines) expands to a permutation of the ordered set of non-whitespace lines (split by \n).
  • PERM_CONDNEZ(cond) expands to either cond or (cond) != 0.
  • PERM_INT(lo, hi) expands to an integer between lo and hi (which must be constants).
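To get a feel for how many candidates each macro contributes, the sketch below enumerates the expansions of a few of them as plain strings. These helpers are illustrative only and do not mirror the permuter's internals. Two assumptions are worth flagging: PERM_INT is treated as inclusive on both ends, and PERM_LINESWAP is taken to mean every ordering of the non-blank lines.

```python
from itertools import permutations
from typing import List


def perm_general(*options: str) -> List[str]:
    """PERM_GENERAL(a, b, ...): any one of the arguments."""
    return list(options)


def perm_typecast(*types: str) -> List[str]:
    """PERM_TYPECAST(a, b, ...): a cast to each type; an empty argument means no cast."""
    return [f"({t})" if t else "" for t in types]


def perm_condnez(cond: str) -> List[str]:
    """PERM_CONDNEZ(cond): the condition as written, or compared against zero."""
    return [cond, f"({cond}) != 0"]


def perm_int(lo: int, hi: int) -> List[str]:
    """PERM_INT(lo, hi): each integer in the range (ASSUMED inclusive of both bounds)."""
    return [str(n) for n in range(lo, hi + 1)]


def perm_lineswap(lines: str) -> List[str]:
    """PERM_LINESWAP(lines): every ordering of the non-whitespace lines."""
    kept = [line for line in lines.split("\n") if line.strip()]
    return ["\n".join(order) for order in permutations(kept)]


print(perm_typecast("s32", "u8", ""))        # ['(s32)', '(u8)', '']
print(perm_condnez("x"))                     # ['x', '(x) != 0']
print(len(perm_lineswap("a;\n\nb;\nc;")))    # 6
```

Note that PERM_LINESWAP grows factorially with the number of lines, so it is best kept to small regions.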

Arguments are split by commas, excluding commas inside parentheses. (,) is a special escape sequence that resolves to ,.
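The splitting rule can be sketched in a few lines of Python. This is an illustration of the rule as described, not the permuter's own parser: split_macro_args is a hypothetical name, whitespace is left untouched, and the sketch assumes the (,) escape is recognised at any nesting depth.

```python
from typing import List


def split_macro_args(text: str) -> List[str]:
    """Split macro arguments on top-level commas.

    Commas nested inside parentheses do not split, and the three-character
    sequence "(,)" is replaced by a literal comma (ASSUMED to apply at any depth).
    """
    args: List[str] = []
    current: List[str] = []
    depth = 0
    i = 0
    while i < len(text):
        if text.startswith("(,)", i):
            current.append(",")  # escape sequence: emit a literal comma
            i += 3
            continue
        ch = text[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append("".join(current))  # top-level comma ends the argument
            current = []
        else:
            current.append(ch)
        i += 1
    args.append("".join(current))
    return args


print(split_macro_args("a = , b, f(c, d), e"))
# ['a = ', ' b', ' f(c, d)', ' e']
print(split_macro_args("x(,)y, z"))
# ['x,y', ' z']
```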

Nested macros are allowed, so e.g.

PERM_VAR(delayed, )
PERM_GENERAL(stmt;, PERM_VAR(delayed, stmt;))
...
PERM_VAR(delayed)

is a valid pattern for emitting a statement either at one point or later.
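The two outcomes of this pattern can be traced by hand. The Python sketch below does so, under the reading that the two-argument PERM_VAR(a, b) only records a value and expands to nothing itself; expand_delayed_pattern is a hypothetical name used for the walkthrough.

```python
from typing import List


def expand_delayed_pattern(pick_later: bool, stmt: str = "stmt;") -> List[str]:
    """Trace the PERM_VAR/PERM_GENERAL pattern above for one choice of PERM_GENERAL."""
    meta = {"delayed": ""}            # PERM_VAR(delayed, ): start with an empty value
    lines: List[str] = []
    if pick_later:
        # PERM_GENERAL chose PERM_VAR(delayed, stmt;): store the statement,
        # emit nothing here (ASSUMED behaviour of the two-argument form).
        meta["delayed"] = stmt
    else:
        # PERM_GENERAL chose the plain "stmt;": emit it immediately.
        lines.append(stmt)
    lines.append("...")               # the code in between
    lines.append(meta["delayed"])     # PERM_VAR(delayed): emit whatever was stored
    return [line for line in lines if line]


print(expand_delayed_pattern(pick_later=False))  # ['stmt;', '...']
print(expand_delayed_pattern(pick_later=True))   # ['...', 'stmt;']
```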

FAQ

What do the scores mean? The scores are computed by taking diffs of objdump'd .o files, and giving different penalties for lines that are the same/use the same instruction/are reordered/don't match at all. Stack positions are ignored. For more details, see scorer.py. It's far from a perfect system, and should probably be tweaked to look at e.g. the register diff graph.
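As a rough illustration of diff-based scoring, here is a much-simplified Python sketch. It is not the algorithm in scorer.py: the penalty values are invented, reorderings are not detected, and stack positions are not normalised. It only shows the general idea of charging less for a line that keeps the same instruction than for one that does not match at all.

```python
import difflib
from typing import List

# Hypothetical penalties; the real values live in scorer.py.
PENALTY_SAME_INSTRUCTION = 1   # same mnemonic, different operands
PENALTY_MISMATCH = 10          # different mnemonic, or an inserted/deleted line


def mnemonic(line: str) -> str:
    """Return the first whitespace-separated token of a disassembly line."""
    parts = line.split()
    return parts[0] if parts else ""


def score(target: List[str], candidate: List[str]) -> int:
    """Lower is better; 0 means the two listings are identical."""
    total = 0
    matcher = difflib.SequenceMatcher(a=target, b=candidate, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical lines cost nothing
        old, new = target[i1:i2], candidate[j1:j2]
        # Lines that pair up are compared by mnemonic.
        for o, n in zip(old, new):
            same = mnemonic(o) == mnemonic(n)
            total += PENALTY_SAME_INSTRUCTION if same else PENALTY_MISMATCH
        # Any unpaired lines are pure insertions or deletions.
        total += PENALTY_MISMATCH * abs(len(old) - len(new))
    return total


target = ["addiu $v0, $a0, 1", "jr $ra"]
print(score(target, target))                                  # 0
print(score(target, ["addiu $v1, $a0, 1", "jr $ra"]))         # 1
print(score(target, ["lw $v0, 0($a0)", "jr $ra"]))            # 10
```

With this kind of scheme a register-allocation difference scores far better than a different instruction sequence, which matches the observation that the permuter is most useful once mostly regalloc changes remain.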

What sort of non-matchings are the permuter good at? It's generally best towards the end, when mostly regalloc changes remain. If there are reorderings or functional changes, it's often easy to resolve those by hand, and neither the scorer nor the randomizer tends to play well with them.

Should I use this instead of trying to match code by hand? Well, the manual PERM macros might speed you up if you manage to fit the permuter into your workflow. The random mode, however, is much more of a last-ditch sort of thing. It often finds nonsensical permutations that happen to match regalloc very well by accident. Still, it's often useful in pointing out which parts of the code need to be changed to get the code nearer to matching.

Helping out

There's tons of room for helping out with the permuter! Many more randomization passes could be added, the scoring function is far from optimal, the permuter could be made easier to use, etc. etc. The GitHub Issues list has some ideas.

Ideally, mypy permuter.py and ./run-tests.sh should succeed with no errors, and files should be formatted with black. To set up a pre-commit hook for black, run:

pip install pre-commit black
pre-commit install

PRs that skip this are still welcome, however.