Adding Git information to tutorial (#862)

* Adding an intro to git with helpful commands for someone unfamiliar
with git.

Co-authored by: EllipticEllipsis <elliptic.ellipsis@gmail.com>

* Adding an intro to git with helpful commands for someone unfamiliar
with git.

Co-authored-by: EllipticEllipsis <elliptic.ellipsis@gmail.com>

* ovl_Obj_Y2lift decompiled (#856)

* ovl_Obj_Y2lift decompiled

* format

* pr review fixes

* clean up

Co-authored-by: SonicDcer <noreply@github.com>

* Formating files and moving contributing.md
Also fixes links.

* Adding an intro to git with helpful commands for someone unfamiliar
with git.

formating files too

Co-authored-by: EllipticEllipsis <elliptic.ellipsis@gmail.com>

* pr fixes

Co-authored-by: EllipticEllipsis <elliptic.ellipsis@gmail.com>
Co-authored-by: Alejandro Asenjo <96613413+sonicdcer@users.noreply.github.com>
Co-authored-by: SonicDcer <noreply@github.com>
Parker Burnett 2022-07-11 17:27:49 -07:00 committed by GitHub
parent 54957f8735
commit 3503163a64
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
21 changed files with 703 additions and 120 deletions


@ -34,10 +34,9 @@ It currently builds the following ROM:
Please refer to the following for more information:
- [Website](https://zelda64.dev/)
- [Discord](https://discord.zelda64.dev/)
- [How to Contribute](CONTRIBUTING.md)
* [Website](https://zelda64.dev/)
* [Discord](https://discord.zelda64.dev/)
* [How to Contribute](docs/CONTRIBUTING.md)
## Installation
@ -47,12 +46,10 @@ For Windows 10, install WSL and a distribution by following this
[Windows Subsystem for Linux Installation Guide](https://docs.microsoft.com/en-us/windows/wsl/install-win10).
We recommend using Debian or Ubuntu 20.04 Linux distributions.
### MacOS
Preparation is covered in a [separate document](docs/BUILDING_MACOS.md).
### Linux (Native or under WSL / VM)
#### 1. Install build dependencies
@ -129,7 +126,7 @@ This means that something is wrong with the ROM's contents. Either the baserom f
Running `make init` will also make the `./expected` directory and copy all of the files there, which will be useful when running the diff script. The diff script is useful in decompiling functions and can be run with this command: `./tools/asm-differ/diff.py -wmo3 <insert_function_here>`
**Note**: to speed up the build, you can pass `-jN` to `make setup` and `make`, where N is the number of threads to use in the build, e.g. `make -j4`. The generally-accepted wisdom is to use the number of virtual cores your computer has, which is the output of `nproc` (which should be installed as part of `coreutils`).
**Note**: to speed up the build, you can pass `-jN` to `make setup` and `make`, where N is the number of threads to use in the build, e.g. `make -j4`. The generally-accepted wisdom is to use the number of virtual cores your computer has, which is the output of `nproc` (which should be installed as part of `coreutils`).
The disadvantage is that the ordering of the terminal output is scrambled, so for debugging it is best to stick to one thread (i.e. not pass `-jN`).
(`-j` also exists, which uses unlimited jobs, but is generally slower.)


@ -2,7 +2,6 @@
**N.B. C++17 is required to build the asset processing program that we use (ZAPD), so check your OS version can support this before proceeding**
## Dependencies
For macOS, use Homebrew to install the following dependencies:
@ -23,59 +22,66 @@ brew install coreutils make python3 libpng bash clang-format@11
(The repository expects Homebrew-installed programs to be either linked correctly in `$PATH` etc. or in their default locations.)
## Building mips-linux-binutils
The following instructions are written for MacOS users but should apply to any Unix-like system, with maybe some modifications at the end regarding the bash_profile.
Create destination dir for binutils
```bash
sudo mkdir -p /opt/cross
```
Create and enter local working dir
```bash
mkdir ~/binutils-tmp
cd ~/binutils-tmp
```
Get and extract binutils source
```bash
wget https://ftp.gnu.org/gnu/binutils/binutils-2.35.tar.bz2
tar xjf binutils-2.35.tar.bz2
```
(You may find this command does not work: if so, just access the URL in a browser and save it to `~/binutils-tmp`.)
Create and enter a build directory
```bash
mkdir build-binutils
cd build-binutils
```
Configure the build
```bash
../binutils-2.35/configure --target=mips-linux-gnu --prefix=/opt/cross --disable-gprof --disable-gdb --disable-werror
```
Make and install binutils
```bash
make -j
sudo make install
```
Edit your `~/.bash_profile`/`~/.zprofile` (or the equivalent for whichever shell you use) to add the new binutils binaries to the system PATH
```bash
# Single quotes stop $PATH being expanded now; it is expanded each time the profile is sourced
echo 'export PATH="$PATH:/opt/cross/bin"' >> ~/.bash_profile
```
Reload ~/.bash_profile (or just launch a new terminal tab)
```bash
source ~/.bash_profile
```
If this worked, you can now delete the temporary directory `~/binutils-tmp`.
## Final note
Apple's version of `make` is very out-of-date, so you should use the brew-installed `gmake` in place of `make` in this repo from now on.


@ -1,7 +1,6 @@
# Contributing to the Majora's Mask Decompilation Project
Thanks for helping us reverse engineer *The Legend of Zelda: Majora's Mask* for the N64!
Thanks for helping us reverse engineer *The Legend of Zelda: Majora's Mask* for the N64!
All contributions are welcome. This is a group effort, and even small contributions can make a difference. Some tasks also don't require much knowledge to get started.
This document is meant to be a set of tips and guidelines for contributing to the project.
@ -9,19 +8,17 @@ For general information about the project, see [our readme](https://github.com/z
Most discussions happen on our [Discord Server](https://discord.zelda64.dev) where you are welcome to ask if you need help getting started, or if you have any questions regarding this project and other decompilation projects.
## Useful Links
- [Installation guide](https://github.com/zeldaret/mm/blob/master/README.md#installation) - Instructions for getting this repository set up and built on your machine.
- [Style Guide](docs/STYLE.md) - Description of the project style that we ask contributors to adhere to.
- [Code Review Guidelines](docs/REVIEWING.md) - These are the guidelines that reviewers will be using when reviewing your code. Good to be familiar with these before submitting your code.
- [Style Guide](STYLE.md) - Description of the project style that we ask contributors to adhere to.
- [Code Review Guidelines](REVIEWING.md) - These are the guidelines that reviewers will be using when reviewing your code. Good to be familiar with these before submitting your code.
- [Zelda 64 Reverse Engineering Website](https://zelda64.dev/mm) - Our homepage, with FAQ and progress graph :chart_with_upwards_trend:.
- [MM decomp tutorial](docs/tutorial/contents.md) - Detailed tutorial for learning in general how decomp works and how to decompile a small, simple file.
- [MM decomp tutorial](tutorial/contents.md) - Detailed tutorial for learning in general how decomp works and how to decompile a small, simple file.
- [Introduction to OOT decomp](https://github.com/zeldaret/oot/blob/master/docs/tutorial/contents.md) - The tutorial the MM one was based on. For OOT, but largely applicable to MM as well. Covers slightly different topics, including how to get your data OK with `vbindiff`.
- The `#resources` channel on the Discord contains many more links on specific details of decompiling IDO MIPS code.
## Getting Started
### What should I know to take part?
@ -36,7 +33,6 @@ The [OoT Decompilation Project](https://github.com/zeldaret/oot) is farther alon
This project only uses *publicly available code*.
**N.B.** Anyone who wishes to contribute to the OOT or MM projects **must not have accessed leaked source code at any point in time** for Nintendo 64 SDK, iQue player SDK, libultra, Ocarina of Time, Majora's Mask, Animal Crossing/Animal Forest, or any other game that shares the same game engine or significant portions of code to a Zelda 64 game or any other console similar to the Nintendo 64.
### Environment Setup
@ -46,27 +42,27 @@ You should be able to build a matching ROM before you start making any changes.
### First Contribution
Usually, the best place to get started is to decompile an actor overlay.
Usually, the best place to get started is to decompile an actor overlay.
An *actor* is anything in the game that moves or performs actions or interactions. This includes things like Link, enemies, NPCs, doors, pots, etc. Actors are good for a first file because they are generally small, self-contained systems.
We recommend that you [join the Discord](https://discord.zelda64.dev/) to say hello and get suggestions on where to start on the `#mm-decomp` channel.
We track who is working on what on some Google Sheets available in the Discord. Once you've decided on or been recommended a good first file, mark it as Reserved.
The workflow is:
- Reserve a file,
- decompile it,
- submit a PR,
The workflow is:
- Reserve a file,
- decompile it,
- submit a PR,
- repeat while addressing review comments.
The expectation is that one reservation goes to one file which ends up in a one file PR, although naturally some files are more sensibly worked on as a group, for example two actors that work together. This also does not apply to large asset files like `gameplay_keep`: you can just reserve the parts that are used in your files.
If possible, we expect reserved files to be completed. If you find you cannot complete a file, because it is intractable for one reason or another, or real-life circumstances get in the way, please talk to one of the leads in Discord; we may find someone else interested in helping you finish, or who is happy to take over the file from you completely. If you unreserve a file on which you have useful progress, please leave a link to your branch in the Notes column on the Google Sheet that the next person who works on the file can use.
## Style Guide & Conventions
See the [Style Guide](docs/STYLE.md).
See the [Style Guide](STYLE.md).
## `NON_MATCHING` and `NON_EQUIVALENT`
@ -84,7 +80,8 @@ void CollisionCheck_SpawnWaterDroplets(PlayState* play, Vec3f* v);
```
Before PRing with a `NON_MATCHING`, you can try
- using the [decomp-permuter](tools/decomp-permuter) to find a closer match,
- using the [decomp-permuter](https://github.com/simonlindholm/decomp-permuter) to find a closer match,
- Asking in `#mm-decomp-help` in Discord; the easiest way to allow other people to play around with the function you are stuck on is to make a scratch on [decomp.me](http://decomp.me).
`NON_EQUIVALENT` can be used with the same syntax as `NON_MATCHING`, but it is used to mark sections of code which do not match *and* do not have the same behavior as the original code.
@ -104,8 +101,7 @@ Documenting is more than just adding comments. Documenting also includes:
Overlays are not required to be documented at this time, but files from `code/` and `boot/` should be documented. When documentation on a file has been started it should be as complete as reasonable.
See the [Style Guide](docs/STYLE.md) for more details on documentation style.
See the [Style Guide](STYLE.md) for more details on documentation style.
## Pull Requests (PRs)
@ -125,13 +121,15 @@ Feel free to reach out on the Discord if you have any questions about these step
### Pull Request Process
After opening a PR, the Jenkins agent will test formatting and the contents of the spec, build the ROM, and check for warnings.
If there is an error, double-check that you can successfully
If there is an error, double-check that you can successfully
```bash
make disasm
./extract_assets.py -f
make clean
make
```
locally. If the build is `OK`, the next thing to check is that all added/modified files were `git add`-ed to your commit. The final check before posting on Discord for help is that there are no new warnings added to the code causing Jenkins to fail. You can check this by running: `tools/warnings_count/check_new_warnings.sh`.
Each PR needs a review from two reviewers, at least one a project lead, and final approval from Kenix.


@ -1,6 +1,6 @@
# Reviewing Pull Requests to the Majora's Mask Decompilation Project
Thanks for helping us reverse engineer *The Legend of Zelda: Majora's Mask*!
Thanks for helping us reverse engineer *The Legend of Zelda: Majora's Mask*!
We encourage all contributors to participate in code review: this is your codebase too!
Every review submitted helps us keep code quality high and code merged in more quickly.
@ -11,13 +11,11 @@ Most discussions happen on our [Discord Server](https://discord.zelda64.dev) whe
Other links are available in the [CONTRIBUTING.md](CONTRIBUTING.md)
## Getting Started
### What should I know to take part in the review process?
You should first familiarise yourself with our [Contributing guide](CONTRIBUTING.md) and [Style guide](docs/STYLE.md). It is also recommended that you have already successfully submitted a merged pull request to understand how the process works before submitting a review.
You should first familiarise yourself with our [Contributing guide](CONTRIBUTING.md) and [Style guide](STYLE.md). It is also recommended that you have already successfully submitted a merged pull request to understand how the process works before submitting a review.
## Pull Requests (PRs)
@ -29,6 +27,7 @@ You should first famiiarise yourself with our [Contributing guide](CONTRIBUTING.
- If someone does not address your comments and expresses that a different way is better than yours, look for feedback from other contributors (we encourage discussing this sort of thing in Discord, since long GitHub conversations get hard to read). The project leads will have final say in these situations. All decisions are generally guided by a consensus of contributors.
### Reviewer Checklist
- [ ] Jenkins build is successful.
- [ ] `make` builds a matching ROM.
- [ ] `format.sh` was run.


@ -44,26 +44,31 @@ A lot of formatting is done by clang-format, such as
There are various other conventions that it does not catch, though:
- Blank line between declarations and code:
```c
s32 var;
func();
```
- combine declarations and definitions if possible:
```c
s32 var = 0;
func();
```
instead of
```c
s32 var;
var = 0;
func();
```
- blank lines between switch cases if they're long (use your judgement).
- blank lines between switch cases if they're long (use your judgement).
## Numbers
@ -98,6 +103,7 @@ Floats usually need an `f` on the end to match, or IDO will use doubles. Our flo
- When conditions are `&&`d or `||`d together, use brackets around each that includes an arithmetic comparison or bitwise operator (i.e. not `!var` or `func()`, but ones with `==` or `&` etc.)
- Flag checks or functions that return booleans do not need the `== 0`/`!= 0`.
- Prefer `if-else` over `if { return; }`, i.e.
```c
if (cond) {
foo();
@ -105,7 +111,9 @@ Floats usually need an `f` on the end to match, or IDO will use doubles. Our flo
bar();
}
```
over
```c
if (cond) {
foo();
@ -113,6 +121,7 @@ Floats usually need an `f` on the end to match, or IDO will use doubles. Our flo
}
bar();
```
**Exception**: After `Actor_MarkForDeath` or sometimes setting the action function, if it makes sense to do so (this expresses the finality a bit better).
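As a minimal sketch of the bracket and boolean-check conventions above: the `DemoActor` struct, `DEMO_FLAG_ACTIVE` and the `Demo_*` functions are all hypothetical names made up for illustration, and the `s32`/`u16` typedefs stand in for the project's own.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t s32;
typedef uint16_t u16;

/* Hypothetical flag and struct, for illustration only. */
#define DEMO_FLAG_ACTIVE (1 << 0)

typedef struct {
    u16 flags;
    s32 timer;
    s32 health;
} DemoActor;

/* Returns a boolean, so callers do not compare the result against 0. */
s32 Demo_IsReady(DemoActor* this) {
    return this->timer == 0;
}

s32 Demo_ShouldAct(DemoActor* this, s32 isPaused) {
    // Brackets go around the `&` check and the `>` comparison,
    // but not around `!isPaused` or the function call.
    return (this->flags & DEMO_FLAG_ACTIVE) && !isPaused && Demo_IsReady(this) && (this->health > 0);
}

void Demo_Check(void) {
    DemoActor actor = { DEMO_FLAG_ACTIVE, 0, 3 };

    assert(Demo_ShouldAct(&actor, 0));
    assert(!Demo_ShouldAct(&actor, 1));
}
```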
## Macros and enums
@ -120,8 +129,8 @@ Floats usually need an `f` on the end to match, or IDO will use doubles. Our flo
Become familiar with the various defines and enums we have available. There are too many to list all of them here, but the following are common:
- Those in `macros.h`
- `ABS`, `ABS_ALT`,
- `CLAMP` and friends,
- `ABS`, `ABS_ALT`,
- `CLAMP` and friends,
- `BINANG_*`, which are used for angles, especially when there's a lot of `s16` casts around
- `MTXMODE` for many of the `sys_matrix` functions
- CollisionCheck flags: `AT_ON` and so on. Pick the appropriate one for the collider type.
@ -149,6 +158,7 @@ void EnFirefly_Update(Actor* thisx, PlayState* play2) {
```
In other places the cast is actually not explicitly needed, but a stack `pad` variable is still needed. For this there should just be a stack variable called `pad` of type `s32` before the actor `THIS` cast. For example in `z_bg_goron_oyu`
```c
void BgGoronOyu_Init(Actor* thisx, PlayState* play) {
s32 pad;
@ -158,10 +168,10 @@ void BgGoronOyu_Init(Actor* thisx, PlayState* play) {
In general, pads should be `s32`, or `s16`/`s8` if required.
## Documentation and Comments
Documentation includes:
- Naming functions
- Naming struct variables
- Naming data
@ -175,13 +185,16 @@ Documentation includes:
If you are not sure what something does, it is better to leave it unnamed than name it wrongly. It is fine to make a note of something you are not sure about when PRing; it means the reviewers will pay special attention to it.
We use comments for:
- Top of file: a short description of the system. For actors there is already a brief description of our current understanding, but feel free to add to it.
- For function descriptions, we use multiline comments,
- For function descriptions, we use multiline comments,
```c
/**
* Describe what the function does
*/
```
These are *optional*: if you think the code is clear enough, you do not need to put a comment. You can use Doxygen formatting if you think it adds something, but it is also not required.
- If something in a function is strange, or unintuitive, do leave a comment explaining what's going on. We use `//` for this.
- We also use `//` for temporary comments above a function. Feel free to use `TODO:` in these if appropriate.
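The comment conventions above might look like this in practice. This is only a sketch: both functions and their names are hypothetical and do not come from the repository.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t s32;

/**
 * Counts a timer down by one frame, stopping at zero.
 */
s32 Demo_StepTimer(s32 timer) {
    if (timer > 0) {
        timer--;
    }
    return timer;
}

// TODO: hypothetical placeholder, rename once the purpose is understood
s32 func_DEMO0001(s32 timer) {
    // The step is applied twice per call, so the timer runs at double speed.
    return Demo_StepTimer(Demo_StepTimer(timer));
}

void Demo_CommentsCheck(void) {
    assert(Demo_StepTimer(3) == 2);
    assert(func_DEMO0001(1) == 0);
}
```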
@ -199,7 +212,7 @@ All functions should go in the main C file in the same order as the assembly (th
- If in doubt, leave all the data at the top of the file. Reviewers will decide for you.
- Data must go in the same order as in the assembly files, but is only constrained by other data, not functions or rodata.
- Some data has to be inline static to match. Generally it's better to not use `static` on data outside functions until the file is matching, since `static` data is left out of the mapfile and this makes debugging harder.
- Some data has to be inline static to match. Generally it's better to not use `static` on data outside functions until the file is matching, since `static` data is left out of the mapfile and this makes debugging harder.
- *This is even more true of bss, where we have trouble with IDO unpredictably reordering it in certain files.*
- For small arrays or simple data that is used in only one function, we usually inline it, if it fits in the ordering.
- Generally data that is only used by the draw functions is put down near them: this is one of the few consistencies in ordering of actors' functions.
@ -207,15 +220,19 @@ All functions should go in the main C file in the same order as the assembly (th
### Enums and defines
- Actors that bitpack params should have macros made for each access or write that is made. `z_en_dg.h` has an undocumented example,
```c
#define ENDG_GET_FC00(thisx) (((thisx)->params & 0xFC00) >> 0xA)
#define ENDG_GET_3E0(thisx) (((thisx)->params & 0x3E0) >> 5)
```
while `z_en_firefly.h` has a documented one,
```c
#define KEESE_INVISIBLE (1 << 0xF)
#define KEESE_GET_MAIN_TYPE(thisx) ((thisx)->params & 0x7FFF)
```
- In a similar manner, actors that use `home.rot.(x|y|z)` like params should also have macros made for accesses and writes. (See, e.g. `z_obj_bean.h`.)
- Stuff that only the actor itself will use goes in the C file unless needed in the header.
- Anything actor-specific that might be used by another file goes in the header, in particular params access macros.
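To see what the params macros quoted above actually extract, here is a small self-contained sketch. The four macros are copied from the examples above; the one-field `Actor` struct is a stand-in assumption (the real struct is far larger), and the params values are invented for the demonstration.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t s16;

/* Minimal stand-in for the real Actor struct: only the field the macros read. */
typedef struct {
    s16 params;
} Actor;

#define ENDG_GET_FC00(thisx) (((thisx)->params & 0xFC00) >> 0xA)
#define ENDG_GET_3E0(thisx) (((thisx)->params & 0x3E0) >> 5)

#define KEESE_INVISIBLE (1 << 0xF)
#define KEESE_GET_MAIN_TYPE(thisx) ((thisx)->params & 0x7FFF)

void Params_Check(void) {
    /* Invented params: 3 in bits 10-15, 7 in bits 5-9, 2 in the low bits. */
    Actor dog = { (3 << 0xA) | (7 << 5) | 2 };
    /* Invented params: invisible flag set, main type 4. */
    Actor keese = { (s16)(KEESE_INVISIBLE | 4) };

    assert(ENDG_GET_FC00(&dog) == 3);
    assert(ENDG_GET_3E0(&dog) == 7);
    assert(KEESE_GET_MAIN_TYPE(&keese) == 4);
    assert((keese.params & KEESE_INVISIBLE) != 0);
}
```

Naming each access this way means the mask and shift live in one place, so they can be renamed together once the field's purpose is known.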


@ -286,6 +286,6 @@ We use a [Google Sheet](https://docs.google.com/spreadsheets/d/1X83YCPRa532v-Zo0
- **Function size statistics**: Intended as a crude estimate of how hard a file will be. Beginners should look for small largest function size and total size; the columns give a rough estimate of the distribution of function sizes without getting unnecessarily statistically descriptive. As you become more experienced, you should work on larger files to leave the smaller ones for other beginners.
- **Description**: What the file is. It's helpful if you can fill this in if you know! They should be synchronised with the short top-of-file descriptions.
- **Status**: (Free)/Reserved/PR/Merged. To be kept up-to-date by the reserver.
- **Reserved**: To reserve a file, put your Discord name in the "Reserved" column. It is common courtesy to not work on a file that is being worked on by another contributor, so ensure the "Reserved" column is blank before working on a file. If it is not, you can ask the reserver(s) if they want to release it or collaborate on it, but don't expect them to agree. More information on what is expected when you reserve a file is available in the [CONTRIBUTING.md](../CONTRIBUTING.md).
- **Reserved**: To reserve a file, put your Discord name in the "Reserved" column. It is common courtesy to not work on a file that is being worked on by another contributor, so ensure the "Reserved" column is blank before working on a file. If it is not, you can ask the reserver(s) if they want to release it or collaborate on it, but don't expect them to agree. More information on what is expected when you reserve a file is available in the [CONTRIBUTING.md](CONTRIBUTING.md).
- **Interested**: If you would like to work on a file, but don't want to reserve it, or would be interested in collaboration, etc. You should talk to any Interested people if you want to work on the file.
- **Notes**: Any other useful information: partial progress by someone unable to finish the file, other files it works with, etc.


@ -220,8 +220,7 @@ which is long, messy, and contains some rather nasty-looking control flow, inclu
}
```
If you read the OoT tutorial, you'll know these nested negated ifs all using the same variable are a good indicator that there's a switch. The problem is working out how to write it.
If you read the OoT tutorial, you'll know these nested negated ifs all using the same variable are a good indicator that there's a switch. The problem is working out how to write it.
## Goto-only mode
@ -303,6 +302,7 @@ block_17:
```
which in many ways looks worse: you can see why the use of gotos in code is strongly discouraged. However, if you throw this in `diff.py`, you'll find it's rather closer than you'd have thought. Goto-only mode has the advantages that
- code is always in the right order: mips2c has not had to reorder anything to get the ifs to work out
- it is often possible to get quite close with gotos, then start removing them, checking the matching status at each point. This is usually easier than trying to puzzle out the way it's trying to jump out of an `if ( || )` or similar.
- if you're trying to keep track of where you are in the code, the gotos mean that it is closer to the assembly in the first place.
@ -404,7 +404,6 @@ block_17:
We can't apply this rule any more, so we need to move on to the next: `block_17` just contains a `return`. So we can replace it by `return` everywhere it appears.
```C
void func_809527F8(EnMs* this, PlayState* play) {
u8 temp_v0;
@ -486,6 +485,7 @@ Now let's start thinking about switches. A good indicator of a switch in goto-on
```
because
- there are multiple ifs that are simple numeric comparisons of the same argument
- the goto blocks are in the same order as the ifs
- there is one last goto at the end that triggers if none of the ifs does: this sounds an awful lot like a `default`!
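The three indicators listed above can be seen in a stripped-down sketch. Everything here is hypothetical (the function names, the state values 4 and 5, and the prices are invented); it only shows that a goto chain of this shape and a `switch` with a `default` behave identically.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t s32;

/* Goto-only shape: simple comparisons of one value, blocks in the same order as the ifs. */
s32 Demo_PriceGoto(s32 state) {
    s32 price;

    if (state == 4) {
        goto block_4;
    }
    if (state == 5) {
        goto block_5;
    }
    goto block_default; // taken when none of the ifs triggers
block_4:
    price = 10;
    goto end;
block_5:
    price = 20;
    goto end;
block_default:
    price = 0;
end:
    return price;
}

/* The same logic written as a switch. */
s32 Demo_PriceSwitch(s32 state) {
    s32 price;

    switch (state) {
        case 4:
            price = 10;
            break;

        case 5:
            price = 20;
            break;

        default:
            price = 0;
            break;
    }
    return price;
}

void Demo_CheckEquivalent(void) {
    s32 state;

    for (state = 0; state < 10; state++) {
        assert(Demo_PriceGoto(state) == Demo_PriceSwitch(state));
    }
}
```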
@ -523,7 +523,8 @@ So let us rewrite the entire second half as a switch:
}
```
There's a couple of other obvious things here:
There's a couple of other obvious things here:
- the last `return` in `case 0` is unnecessary since there is no other code after the switch, so breaking is equivalent to the return
- a common pattern everywhere: a sequence of ifs whose last statement is a return is the same as an if-else chain, so we can rewrite these as
@ -622,6 +623,7 @@ block_7:
```
Now, the top of the function also looks like a switch:
```C
temp_v0 = Message_GetState(&play->msgCtx);
if (temp_v0 == 4) {
@ -742,6 +744,6 @@ void func_809527F8(EnMs* this, PlayState* play) {
}
```
And this matches!
And this matches!
We will not document this now, although even with so few functions named it seems pretty clear that it's to do with buying beans (and indeed, Magic Beans cost 10 Rupees and have Get Item ID `0x35`). You might like to try to match this function without using goto-only mode, to compare. It is also an interesting exercise to see what each elimination does to the diff: sometimes it will stray surprisingly far for a small change.


@ -6,7 +6,6 @@ Open the C file and the H file with your actor's name from the appropriate direc
Each actor has associated to it a data file and one assembly file per function. During the process, we will transfer the contents of all or most of these into the main C file. VSCode's search feature usually makes it quite easy to find the appropriate files without troubling the directory tree.
## Anatomy of the C file
The actor file starts off looking like:
@ -102,7 +101,6 @@ It is currently divided into six sections as follows:
6. List of functions. Each `#pragma GLOBAL_ASM` is letting the compiler use the corresponding assembly file while we do not have decompiled C code for that function. The majority of the decompilation work is converting these functions into C that looks like a human wrote it.
## Header file
The header file looks like this at the moment:
@ -133,10 +131,10 @@ The struct currently contains a variable that is the `Actor` struct, which all a
The header file is also used to declare structs and other information about the actor that is needed by other files (e.g. by other actors): one can simply `#include` the header rather than `extern`ing it.
## Order of decompilation
The general rule for order of decompilation is
- Start with `Init`, because it usually contains the most information about the structure of the actor. You can also do `Destroy`, which is generally simpler than `Init`.
- Next, decompile any other functions from the actor you have found in `Init`. You generally start with the action functions, because they return nothing and all take the same arguments,
@ -158,20 +156,19 @@ The above is a rough ordering for the beginner. As you become more experienced,
Associated to each actor is a `.data` file, containing data that the actor uses. This ranges from spawn positions, to animation information, to even assets that we have to extract from the ROM. Since the structure of the data is very inconsistent between actors, automatic importing has been very limited, so the vast majority must be done manually.
There are two ways of transfering the data into an actor: we can either
- import it all naively as words (`s32`s), which will still allow it to compile, and sort out the actual types later, or
There are two ways of transfering the data into an actor: we can either
- import it all naively as words (`s32`s), which will still allow it to compile, and sort out the actual types later, or
- we can extern each piece of data as we come across it, and come back to it later when we have a better idea of what it is.
We will concentrate on the second here; the other is covered in [the document about data](data.md). Thankfully this means we essentially don't have to do anything to the data yet. Nevertheless, it is often quite helpful to copy over at least some of the data and leave it commented out for later replacement. *Data must go in the same order as in the data file, and data is "all or nothing": you cannot only import some of it*.
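To illustrate why importing data naively as words still produces the same bytes, here is a small sketch. The symbol names and values are invented, and it assumes a host where `float` is a 32-bit IEEE-754 value (as on the N64).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t s32;
typedef float f32;

/* Hypothetical data symbol imported naively as words. */
s32 D_DEMO_DATA[] = { 0x3F800000, 0x42C80000 };

/* The same bytes once the real type is known to be a pair of floats. */
f32 sDemoFloats[] = { 1.0f, 100.0f };

/* Returns 1 if both arrays have identical contents byte for byte. */
s32 Demo_SameBytes(void) {
    return memcmp(D_DEMO_DATA, sDemoFloats, sizeof(sDemoFloats)) == 0;
}

void Demo_DataCheck(void) {
    assert(sizeof(D_DEMO_DATA) == sizeof(sDemoFloats));
    assert(Demo_SameBytes());
}
```

Because the bytes are identical, the types can be corrected later without affecting whether the file matches.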
**WARNING** The way in which the data was extracted from the ROM means that there are sometimes "fake symbols" in the data, which have to be removed to avoid confusing the compiler. Thankfully it will turn out that this is not the case here.
(Sometimes it is useful to import the data in the middle of doing functions: you just have to choose an appropriate moment.)
Some actors also have a `.bss` file. This is just data that is initialised to 0, and can be imported immediately once you know what type it is, by declaring it without giving it a value. (bss is a significant problem for code files, but not *usually* for actors.)
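As a sketch of what "declaring it without giving it a value" means for bss: file-scope variables in C with no initialiser start out as zero. The symbol names below are hypothetical, and `Vec3f` is written out here as an assumed three-float struct rather than taken from the project headers.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t s32;
typedef float f32;

typedef struct {
    f32 x, y, z;
} Vec3f;

/* Hypothetical bss symbols: no initialiser, so they are zero at startup. */
s32 sDemoCounter;
Vec3f sDemoPos;

void Demo_BssCheck(void) {
    assert(sDemoCounter == 0);
    assert(sDemoPos.x == 0.0f);
    assert(sDemoPos.y == 0.0f);
    assert(sDemoPos.z == 0.0f);
}
```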
## Init
The Init function sets up the various components of the actor when it is first loaded. It is hence usually very useful for finding out what is in the actor struct, and so we usually start with it. (Some people like starting with Destroy, which is usually shorter and simpler and still gives some basic information about the actor, but Init is probably best for beginners.)
@ -183,15 +180,19 @@ The first stage of decompilation is done by a program called mips_to_c, often re
The web version of mips2c can be found [here](https://simonsoftware.se/other/mips_to_c.py). This was [covered in the OoT tutorial](https://github.com/zeldaret/oot/blob/master/docs/tutorial/beginning_decomp.md). We shall instead use the repository. Clone [the mips_to_c repository](https://github.com/matt-kempster/mips_to_c) into a separate directory (we will assume on the same level as the `mm/` directory). Since it's Python, we don't have to do any compilation or anything in the mips_to_c directory.
Since the actor depends on the rest of the codebase, we can't expect to get much intelligible out of mips2c without giving it some context. We make this using a Python script in the `tools` directory called `m2ctx.py`, so run
```
$ ./tools/m2ctx.py <path_to_c_file>
```
from the main directory of the repository. In this case, the C file is `src/overlays/actors/ovl_En_Recepgirl/z_en_recepgirl.c`. This generates a file called `ctx.c` in the main directory of the repository.
To get mips_to_c to decompile a function, the bare minimum is to run
```
$ ../mips_to_c/mips_to_c.py <path_to_function_assembly_file>
```
(from the root directory of `mm`). We can tell mips2c to use the context file we just generated by adding `--context ctx.c`. If we have data, mips2c may be able to assist with that as well.
In this case, we want the assembly file for `EnRecepgirl_Init`. You can copy the path to the file in VSCode or similar, or just tab-complete it once you know the directory structure well enough: it turns out to be `asm/non_matchings/overlays/ovl_En_Recepgirl/EnRecepgirl_Init.s`.
@ -199,6 +200,7 @@ In this case, we want the assembly file for `EnRecepgirl_Init`. You can copy the
**N.B.** You want the file in `non_matchings`! The files in the other directories in `asm/` are the *unsplit* asm, which can be used, but is less convenient (you would need to include the rodata, for example, and it will do the whole file at once. This is sometimes useful, but we'll go one function at a time today to keep things simple).
We shall also include the data file, which is located at `data/overlays/ovl_En_Recepgirl/ovl_En_Recepgirl.data.s`. Hence the whole command will be
```
$ ../mips_to_c/mips_to_c.py asm/non_matchings/overlays/ovl_En_Recepgirl/EnRecepgirl_Init.s data/ovl_En_Recepgirl/ovl_En_Recepgirl.data.s --context ctx.c
? func_80C10148(EnRecepgirl *); // extern
@ -233,6 +235,7 @@ void EnRecepgirl_Init(EnRecepgirl* this, PlayState* play) {
func_80C10148(this);
}
```
Comment out the `GLOBAL_ASM` line for `Init`, and paste all of this into the file just underneath it:
```C
@ -271,6 +274,7 @@ void EnRecepgirl_Init(Actor* thisx, PlayState* play) {
}
[...]
```
</details>
Typically for all but the simplest functions, there is a lot that needs fixing before we are anywhere near seeing how close we are to the original code. You will notice that mips2c creates a lot of temporary variables. Usually most of these will turn out to not be real, and we need to remove the right ones to get the code to match.
@ -364,7 +368,6 @@ extern s32 D_80C106C8;
**N.B.** As is covered in more detail in [the document about data](data.md), the data *must* be declared in the same order in C as it was in the data assembly file: notice that the order in this example is `En_Recepgirl_InitVars`, `D_80C106B0`, `D_80C106C0`, `D_80C106C8`, the same as in `data/ovl_En_Recepgirl/ovl_En_Recepgirl.data.s`.
In the next sections, we shall sort out the various initialisation functions that occur in Init. This actor contains several of the most common ones, but it does not have, for example, a collider. The process is similar to what we discuss below, or you can check the OoT tutorial.
<!-- ### Data and function prototypes
@ -372,7 +375,7 @@ In the next sections, we shall sort out the various initialisation functions tha
Let's first look at the block of stuff that mips2c has put above the function. This usually contains useful information, but often needs work to make it compile and be in the right place. -->
### Init chains
Almost always, one of the first items in `Init` is a function that looks like
```C
@ -381,10 +384,10 @@ Actor_ProcessInitChain(&this->actor, D_80C106C0);
which initialises common properties of the actor using an InitChain, which is usually somewhere near the top of the data, in this case in the variable `D_80C106C0`. This is already included in the `#if`'d out data at the top of the file, so we don't have to do anything for now. We can correct the mips2c output for the extern, though: I actually did this when moving the rest of the data in the previous section.
### SkelAnime
This is the combined system that handles actors' skeletons and their animations. It is the other significant part of most actor structs. We see its initialisation in this part of the code:
```C
Actor_ProcessInitChain(&this->actor, D_80C106C0);
ActorShape_Init(&this->actor.shape, -60.0f, NULL, 0.0f);
@ -394,12 +397,15 @@ This is the combined system that handles actors' skeletons and their animations.
An actor with SkelAnime has three structs in the Actor struct that handle it: one called SkelAnime, and two arrays of `Vec3s`, called `jointTable` and `morphTable`. Usually, although not always, they are next to one another.
There are two different sorts of SkelAnime, although for decompilation purposes there is not much difference between them. Looking at the prototype of `SkelAnime_InitFlex` from `functions.h` (or even the definition in `z_skelanime.c`),
```C
void SkelAnime_InitFlex(PlayState* play, SkelAnime* skelAnime, FlexSkeletonHeader* skeletonHeaderSeg,
AnimationHeader* animation, Vec3s* jointTable, Vec3s* morphTable, s32 limbCount);
```
we can read off the types of the various arguments:
- The `SkelAnime` struct is at `this + 0x144`
- The `jointTable` is at `this + 0x188`
- The `morphTable` is at `this + 0x218`
@ -407,6 +413,7 @@ we can read off the types of the various arguments:
- Because of how SkelAnime works, this means that the `jointTable` and `morphTable` both have `24` elements
Looking in `z64animation.h`, we find that `SkelAnime` has size `0x44`, and looking in `z64math.h`, that `Vec3s` has size `0x6`. Since ` 0x144 + 0x44 = 0x188 `, `jointTable` is immediately after the `SkelAnime`, and since `0x188 + 0x6 * 0x18 = 0x218`, `morphTable` is immediately after the `jointTable`. Finally, `0x218 + 0x6 * 0x18 = 0x2A8`, and we have filled all the space between the `actor` and `actionFunc`. Therefore the struct now looks like
```C
typedef struct EnRecepgirl {
/* 0x0000 */ Actor actor;
@ -428,36 +435,46 @@ extern AnimationHeader D_06009890;
extern UNK_TYPE D_0600A280;
extern FlexSkeletonHeader D_06011B60;
```
As with the data, these externed symbols should be kept in increasing address order.
They are both passed to the function as pointers, so need `&` to pass the address instead of the actual data. Hence we end up with
```C
SkelAnime_InitFlex(play, &this->skelAnime, &D_06011B60, &D_06009890, this->jointTable, this->morphTable, 24);
```
note that `this->jointTable` and `this->morphTable` are arrays, so are already effectively pointers and don't need a `&`.
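The offset arithmetic used earlier in this section can be double-checked with a few lines of C. This is only a sketch: the constants are the sizes and offsets quoted in the text, taken as given, and the macro and function names are hypothetical helpers rather than anything from the repository.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in the tutorial text; treated as given, not re-derived. */
#define SKELANIME_OFFSET 0x144 /* where the SkelAnime struct starts in the actor */
#define SKELANIME_SIZE   0x44  /* size of SkelAnime */
#define VEC3S_SIZE       0x6   /* size of Vec3s */
#define LIMB_COUNT       24    /* 0x18, the last argument of SkelAnime_InitFlex */

/* jointTable should start right after the SkelAnime struct. */
static size_t joint_table_offset(void) {
    return SKELANIME_OFFSET + SKELANIME_SIZE;
}

/* morphTable should start right after LIMB_COUNT Vec3s entries of jointTable. */
static size_t morph_table_offset(void) {
    return joint_table_offset() + VEC3S_SIZE * LIMB_COUNT;
}

/* First offset after morphTable, i.e. where the next struct member can begin. */
static size_t after_morph_table_offset(void) {
    return morph_table_offset() + VEC3S_SIZE * LIMB_COUNT;
}
```

If the three results come out as `0x188`, `0x218` and `0x2A8`, the three members are contiguous and there is no unexplained gap to fill with padding or unknown variables.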
### More struct variables: a brief detour into reading some assembly
This function also gives us information about other things in the struct. The only other reference to `this` (rather than `this->actor` or similar) is in
```C
this->unk_2AC = 2;
```
This doesn't tell us much except that at `this + 0x2AC` is a number of some kind. What sort of number? For that we will have to look in the assembly code. This will probably look quite intimidating the first time, but it's usually not too bad if you use functions as signposts: IDO will never change the order of function calls, and tends to keep code between functions in roughly the same place, so you can usually guess where you are.
In this case, we are looking for `this + 0x2AC`. `0x2AC` is not a very common number, so hopefully the only mention of it is in referring to this struct variable. Indeed, if we search the file, we find that the only instruction mentioning `0x2AC` is here:
```mips
/* 0000B0 80C10080 24090002 */ addiu $t1, $zero, 2
/* 0000B4 80C10084 A24902AC */ sb $t1, 0x2ac($s2)
```
`addiu` ("add immediate unsigned") adds the last two things and puts the result in the register in the first position. So this says `$t1 = 0 + 2`. The next instruction, `sb` ("store byte"), puts the value in the register in the first position into the memory location in the second, which in this case says `*($s2 + 0x2ac) = $t1`. We can go and find out what is in `$s2`: it is set *all* the way at the top of the function, in this line:
```mips
/* 000008 80C0FFD8 00809025 */ move $s2, $a0
```
This simply copies the contents of the second register into the first one. In this case, it is copying the contents of the function's first argument into `$s2` (because it wants to use it later, and the `$a` registers are assumed to be cleared after a function call). In this case, the first argument is a pointer to `this` (well, `thisx`, but the struct starts with an `Actor`, so it's the same address). So line `B4` of the asm really is saving `2` into the memory location `this + 0x2AC`.
Anyway, this tells us that the variable is a byte of some kind, so `s8` or `u8`: if it was an `s16/u16` it would have said `sh`, and if it was an `s32/u32` it would have said `sw`. Unfortunately this is all we can determine from this function: MIPS does not have separate instructions for saving signed and unsigned bytes.
At this point you have two options: guess based on statistics/heuristics, or go and look in the other functions in the actor to find out more information. The useful statistic here is that `u8` is far more common than `s8`, but let's look in the other functions, since we're pretty confident after finding `0x2ac` so easily in `Init`. So, let us grep the actor's assembly folder:
```
$ grep -r '0x2ac' asm/non_matchings/overlays/ovl_En_Recepgirl/
asm/non_matchings/overlays/ovl_En_Recepgirl/EnRecepgirl_Draw.s:/* 00065C 80C1062C 921902AC */ lbu $t9, 0x2ac($s0)
@ -468,7 +485,9 @@ asm/non_matchings/overlays/ovl_En_Recepgirl/func_80C100DC.s:/* 00015C 80C1012C 9
asm/non_matchings/overlays/ovl_En_Recepgirl/func_80C100DC.s:/* 000164 80C10134 A09902AC */ sb $t9, 0x2ac($a0)
asm/non_matchings/overlays/ovl_En_Recepgirl/EnRecepgirl_Init.s:/* 0000B4 80C10084 A24902AC */ sb $t1, 0x2ac($s2)
```
in which we clearly see `lbu` ("load byte unsigned"), and hence this variable really is a `u8`. Hence we can add this to the actor struct too:
```C
typedef struct EnRecepgirl {
/* 0x0000 */ Actor actor;
@ -484,6 +503,7 @@ typedef struct EnRecepgirl {
You might think that was a lot of work for one variable, but it's pretty quick when you know what to do. Obviously this would be more difficult with a more common number, but it's often still worth trying.
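The reason the load instruction settles the signedness question, while the store does not, can be sketched in C. The function names below are hypothetical illustrations of what `sb`, `lbu` and `lb` do to a byte, not code from the game; the signed conversion relies on the usual two's-complement behaviour of mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Like `sb`: only the low 8 bits of the register reach memory,
 * so the stored bits are the same for s8 and u8. */
static void store_byte(uint8_t* mem, uint32_t offset, uint32_t reg) {
    mem[offset] = (uint8_t)(reg & 0xFF);
}

/* Like `lbu`: the byte is zero-extended into the 32-bit register. */
static int32_t load_byte_unsigned(const uint8_t* mem, uint32_t offset) {
    return (int32_t)mem[offset];
}

/* Like `lb`: the byte is sign-extended into the 32-bit register. */
static int32_t load_byte_signed(const uint8_t* mem, uint32_t offset) {
    return (int32_t)(int8_t)mem[offset];
}
```

For small values such as `2` both loads agree, which is why `Init` alone could not tell us the type; they only differ once the top bit of the byte is set.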
Removing some of the declarations for data that we have accounted for, the function now looks like this:
```C
? func_80C10148(EnRecepgirl *); // extern
@ -525,6 +545,7 @@ mips2c likes casting a lot: this is useful for getting types, less so when the t
### Functions called
One minor problem is what `func_80C10148` is: C needs a prototype to compile it properly. mips2c has offered us `? func_80C10148(EnRecepgirl *); // extern`, but this is obviously incomplete: there's no `?` type in C! We shall guess for now that this function returns `void`, for two reasons:
1. It's not used as a condition in a conditional or anything
2. It's not used to assign a value
@ -532,6 +553,7 @@ To this experience will add a third reason:
3. This is probably a setup function for an actionFunc, which are usually either `void (*)(ActorType*)` or `void (*)(ActorType*, PlayState*)`.
The upshot of all this is to remove mips2c's `? func_80C10148(EnRecepgirl *); // extern`, and add a `void func_80C10148(EnRecepgirl* this);` underneath the declarations for the main four functions:
```C
void EnRecepgirl_Init(Actor* thisx, PlayState* play);
void EnRecepgirl_Destroy(Actor* thisx, PlayState* play);
@ -543,12 +565,12 @@ void func_80C10148(EnRecepgirl* this);
(we usually leave a blank line after the main four, and put all further declarations in address order).
### Loops
Loops are often some of the hardest things to decompile, because there are many ways to write a loop, only some of which will generate the same assembly. mips2c has had a go at the one in this function, but it usually struggles with loops: don't expect it to get a loop correct, well, at all.
The code in question is
```C
void **temp_s0;
void **phi_s0;
@ -567,46 +589,55 @@ The code in question is
```
`D_80C106B0` is the array that mips2c has declared above the function, a set of 8-digit hex numbers starting `0x06`. These are likely to be *segmented pointers*, but this is not a very useful piece of information yet. `D_80C106C0` is the InitChain, though, and it seems pretty unlikely that it would be seriously involved in any sort of loop. Indeed, if you tried to compile this now, you would get an error:
```
cfe: Error: src/overlays/actors/ovl_En_Recepgirl/z_en_recepgirl.c, line 61: Unacceptable operand of == or !=
} while (temp_s0 != D_80C106C0);
-------------------------^
```
so this can't possibly be right.
So what on earth is this loop doing? Probably the best thing to do is manually unroll it and see what it's doing each time.
1. `phi_s0 = D_80C106B0`, aka `&D_80C106B0[0]`, to `temp_s0 = D_80C106B0 + 4`, i.e. `&D_80C106B0[1]`. But then `temp_s0->unk-4` is 4 backwards from `&D_80C106B0[1]`, which is back at `&D_80C106B0[0]`; the `->` means to look at what is at this address, so `temp_s0->unk-4` is `D_80C106B0[0]`. Equally, `*phi_s0` is the thing at `&D_80C106B0[0]`, i.e. `D_80C106B0[0]`. So the actual thing the first pass does is
```C
D_80C106B0[0] = Lib_SegmentedToVirtual(D_80C106B0[0]);
```
it then proceeds to set `phi_s0 = &D_80C106B0[1]` for the next iteration.
2. We go through the same reasoning and find the inside of the loop is
```C
temp_s0 = &D_80C106B0[2];
D_80C106B0[1] = Lib_SegmentedToVirtual(D_80C106B0[1]);
phi_s0 = &D_80C106B0[2];
```
3.
```C
temp_s0 = &D_80C106B0[3];
D_80C106B0[2] = Lib_SegmentedToVirtual(D_80C106B0[2]);
phi_s0 = &D_80C106B0[3];
```
4.
```C
temp_s0 = &D_80C106B0[4];
D_80C106B0[3] = Lib_SegmentedToVirtual(D_80C106B0[3]);
phi_s0 = &D_80C106B0[4];
```
But now, `&D_80C106B0[4] = D_80C106B0 + 4 * 4 = D_80C106B0 + 0x10`, and `0x10` after this array's starting address is `D_80C106C0`, i.e. the InitChain. Hence at this point the looping ends.
So what this loop actually does is run `Lib_SegmentedToVirtual` on each element of the array `D_80C106B0`.
At this point, I confess that I guessed what this loop does, and rewrote it how I would have written it, namely how one usually iterates over an array:
```C
s32 i;
[...]
@ -650,6 +681,7 @@ void EnRecepgirl_Init(Actor* thisx, PlayState* play) {
func_80C10148(this);
}
```
as our first guess. This doesn't look unreasonable... the question is, does it match?
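To convince yourself that the pointer-walking loop mips2c produced and the indexed loop do the same work, here is a small standalone sketch. Everything in it is a stand-in: `fake_convert` is a hypothetical placeholder for `Lib_SegmentedToVirtual` that just masks off the segment byte, and the array is a local copy rather than the actor's data.

```c
#include <assert.h>
#include <stdint.h>

#define ARRAY_LEN 4

/* Hypothetical stand-in for Lib_SegmentedToVirtual: strips the segment byte. */
static uint32_t fake_convert(uint32_t segmented) {
    return segmented & 0x00FFFFFF;
}

/* Shape of the mips2c output: walk a pointer until it reaches the end address. */
static void convert_pointer_walk(uint32_t* array) {
    uint32_t* cur = array;
    uint32_t* end = array + ARRAY_LEN; /* one past the last element */

    do {
        *cur = fake_convert(*cur);
        cur++;
    } while (cur != end);
}

/* Shape of the hand-written guess: ordinary indexed iteration. */
static void convert_indexed(uint32_t* array) {
    int i;

    for (i = 0; i < ARRAY_LEN; i++) {
        array[i] = fake_convert(array[i]);
    }
}
```

Both functions leave the array in the same state. That only shows the two forms are equivalent as C; whether the compiler emits the same assembly for them is a separate question, which is what the diff is for.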
## Diff
@ -659,9 +691,11 @@ Once preliminary cleanup and struct filling is done, most time spent matching fu
In order to use `diff.py` with the symbol names, we need a copy of the code to compare against. In MM this is done as part of `make init`, and you can regenerate the `expected` directory (which is simply a known-good copy of `build` directory) by running `make diff-init`, which will check for an OK ROM and copy the build directory over. (Of course you need an OK ROM to do this; worst-case, you can checkout master and do a complete rebuild to get it). (You need to remake `expected` if you want to diff a function you have renamed: `diff.py` looks in the mapfiles for the function name, which won't work if the name has changed!)
Now, we run diff on the function name: in the main directory,
```
$ ./diff.py -mwo3 EnRecepgirl_Init
```
(To see what these arguments do, run it with `./diff.py -h` or look in the scripts documentation.)
![FeelsOKMan completely white diff](images/EnRecepgirl_Init_diff_matching.png)


@ -1,12 +1,15 @@
# Getting started
## [Introduction to decomp](introduction.md)
- What we are doing
- Structure of the code
## Pre-decompilation
- [Introduction to git](intro_to_git.md)
- Building the repo (follow the instructions in the [README.md](../../README.md))
- Most of us use VSCode. Some useful information is [here](vscode.md).
<!-- Feel free to document Emacs/Vi/Sublime/whatever if you're familiar with them -->
- Choosing a first actor (You want something small that has simple interactions with the environment. A simple NPC can also work, and is what we will use as an illustration for most of the tutorial. There is a collection of actors we think are suitable for beginners on the spreadsheet or Trello)
@ -35,15 +38,17 @@
- [Documenting a decompiled file](documenting.md)
## [Object Decompilation](object_decomp.md)
- Object files
- How we decompile objects
## After Decompilation
- See the [CONTRIBUTING.md](../CONTRIBUTING.md) for most of the details for submitting PRs. Remember to format again after making adjustments from reviews!
- More information about specific preparations is in [this document](merging.md).
## Appendices
- [Types, Structs and Padding](types_structs_padding.md) (a miscellany of useful stuff)
- [Advanced control flow](advanced_control_flow.md) (an example of a more complex function which mips2c is not so good at)
- [Using the diff script and the permuter](diff_and_permuter.md) (using the diff script and the permuter to match something)


@ -5,16 +5,20 @@
## Table of Contents
- [Data](#data)
- [Table of Contents](#table-of-contents)
- [Data first](#data-first)
- [Extern and data last](#extern-and-data-last)
- [Segmented pointers and object symbols](#segmented-pointers-and-object-symbols)
- [Fake symbols](#fake-symbols)
- [Inlining](#inlining)
- [Finally: .bss](#finally-bss)
Each actor's data is stored in a separate file. EnRecepgirl's data is in `data/overlays/ovl_En_Recepgirl/ovl_En_Recepgirl.data.s`, for example. At some point in the decompilation process we need to convert this raw data into recognisable information for the C to use.
There are two main ways to do this: either
1. import the data first and type it later, or
2. wait until the data appears in functions, extern it, then import it at the end
Sometimes something between these two is appropriate: wait until the largest or strangest bits of data appear in functions, get some typing information out of that, and then import it. For now, though, let's stick to these two.
@ -25,16 +29,18 @@ Both approaches have their advantages and disadvantages.
This way is good for smaller actors with little data. The OoT tutorial [covers this in plenty of detail](https://github.com/zeldaret/oot/blob/master/docs/tutorial/data.md), and the process in MM is essentially identical, so we won't go over it here.
## Extern and data last
Externing is explained in detail in the document about the [Init function](beginning_decomp.md). To summarize, every time a `D_address` appears that is in the data file, we put a
```C
extern UNK_TYPE D_address;
```
at the top of the file, in the same order that the data appears in the data file. We can also give it a type if we know what the type actually is (e.g. for colliders, initchains, etc.), and convert the actual data and place it commented-out under the corresponding line. This means we don't have to do everything at once at the end.
Once we have decompiled enough things to know what the data is, we can import it. The advantage of doing it this way is we should know what type everything is already: in our work on EnRecepgirl, for example, we ended up with the following data at the top of the file
```C
#if 0
const ActorInit En_Recepgirl_InitVars = {
@ -61,11 +67,13 @@ static s32 D_80C106C8 = 0;
#endif
```
and the main thing we need to understand is `D_80C106B0`.
*Before doing anything else, make sure `make` gives `OK`.*
First, we tell the compiler to ignore the original data file. To do this, open the file called `spec` in the main directory of the repository, and search for the actor name. You will find a section that looks like
```
beginseg
name "ovl_En_Recepgirl"
@ -75,7 +83,9 @@ beginseg
include "build/data/ovl_En_Recepgirl/ovl_En_Recepgirl.reloc.o"
endseg
```
We will eventually remove both of the bottom two lines and replace them with our own reloc file, but for now, just comment out the data line:
```
beginseg
name "ovl_En_Recepgirl"
@ -87,6 +97,7 @@ endseg
```
Next remove all the externs, and uncomment their corresponding commented data:
```C
const ActorInit En_Recepgirl_InitVars = {
ACTOR_EN_RECEPGIRL,
@ -111,20 +122,22 @@ static InitChainEntry D_80C106C0[] = {
static s32 D_80C106C8 = 0;
```
That should be everything, and we should now be able to `make` without the data file with no issues.
## Segmented pointers and object symbols
The game has a convenient system that allows it to sometimes effectively use offsets into a file instead of raw memory addresses to reference things. This is done by setting a file address to a *segment*. A segmented address is of the form `0x0XYYYYYY`, where `X` is the segment number. There are 16 available segments, and actors always set segment 6 to their object file, which is a file containing assets (skeleton, animations, textures, etc.) that they use. This is what all those `D_06...` are, and it is also what the entries in `D_80C106B0` are: they are currently raw numbers instead of symbols, though, and we would like to replace them.
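As a minimal sketch of the `0x0XYYYYYY` layout just described, the two helpers below split a segmented address into its segment number and offset. The function names are hypothetical and only mirror the description in the text; they are not the macros the repository uses.

```c
#include <assert.h>
#include <stdint.h>

/* The segment number is the `X` nibble: bits 24-27 of the address. */
static uint32_t segment_number(uint32_t segmentedAddr) {
    return (segmentedAddr >> 24) & 0xF;
}

/* The offset into the segment's file is the low 24 bits (`YYYYYY`). */
static uint32_t segment_offset(uint32_t segmentedAddr) {
    return segmentedAddr & 0x00FFFFFF;
}
```

For example, `0x0600F8F0` splits into segment `6` and offset `0x00F8F0`, which is why the `D_06...` symbols can be read as "offset into the actor's object file".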
There is an obvious problem here, which is that these symbols have to be defined *somewhere*, or the linker will complain (indeed, if we change the ones in the array to `D_...`, even if we extern them, we get
```
mips-linux-gnu-ld: build/src/overlays/actors/ovl_En_Recepgirl/z_en_recepgirl.o:(.data+0x20): undefined reference to `D_0600F8F0'
```
As we'd expect, of course: we didn't fulfil our promise that they were defined elsewhere.)
For actors which have yet to be decompiled, this is mitigated by use of the file `undefined_syms.txt`, which feeds the linker the raw addresses to use as the symbol definitions. However, we want to replace these segmented addresses with proper object symbols whenever possible. In `En_Recepgirl_InitVars`, we can see that this actor uses the object `OBJECT_BG`:
```c
const ActorInit En_Recepgirl_InitVars = {
ACTOR_EN_RECEPGIRL,
@ -135,6 +148,7 @@ const ActorInit En_Recepgirl_InitVars = {
```
If we open up `assets/objects/object_bg.h`, we can see a bunch of different names corresponding to every asset in the object. You may notice that some of these names look a bit familiar; `object_bg_Tex_00F8F0` seems very close to the segmented address `(void*)0x600F8F0`. This is the proper object symbol for this segmented address, so we should `#include` this header in our actor and use these object symbols like so:
```c
static void* D_80C106B0[4] = { object_bg_Tex_00F8F0, object_bg_Tex_00FCF0, object_bg_Tex_0100F0, object_bg_Tex_00FCF0 };
```
@ -143,21 +157,19 @@ After replacing every segmented pointer with an object symbol, you should go ahe
We will come back and name these later when we do the object.
## Fake symbols
Some symbols in the data have been decompiled wrongly, being incorrectly separated from the previous symbol due to how it was accessed by the actor's functions. However, most of these have now been fixed. Some more detail is given in [Types, structs and padding](types_structs_padding.md). If you are unsure, ask!
## Inlining
After the file is finished, it is possible to move some static data into functions. This requires that:
1. The data is used in only one function
2. The ordering of the data can be maintained
Additionally, we prefer to keep larger data (more than a line or two) out of functions anyway.
# Finally: .bss
A .bss section contains data that is uninitialised (actually initialised to `0`). For most actors all you need to do is declare it at the top of the actor file without giving it a value, once you find out what type it is. In `code`, it's much more of a problem.


@ -10,12 +10,10 @@ Variables must be renamed in `tools/disasm/variables.txt`. It may also be necess
You can avoid having to redisassemble every time by running `rename_global_asm.py`, which will rename the individual functions' assembly files in `asm/nonmatchings/` to the name of the function they contain.
## Fake and incorrect symbols
TODO
## Resplitting a file
The files `boot` and `code` are each divided up into dozens of separate files, that are all joined together into one text, data, rodata and bss section when building the ROM. As such, it has been necessary to guess where the file boundaries are, and not every file contains the correct functions or the correct data (rodata is mostly the exception since it is automatically split).
@ -31,6 +29,3 @@ To change a split for a file, find its entry in `tools/disasm/files.txt`, and ch
to the file will extract it correctly as a separate file. It also is necessary to make a new C file and move the `GLOBAL_ASM` declaration into it.
Unfortunately you essentially have to redisassemble after telling the disassembler to resplit a file.


@ -17,14 +17,12 @@ If you want to use `diff.py` after renaming anything, particularly functions, re
Finally, *if you are not sure what something does, either ask or leave it unnamed: it will be less confusing later if things are unnamed than if they are wrongly named*
## Renaming things
Because MM needs to regenerate the assembly code, it is necessary to tell the disassembler the names of functions and variables, so it knows what symbols to assign in the code. This is done via `functions.txt` and `variables.txt`. The best way to rename functions and symbols is via global rename in an editor like VSCode. The next best way is to run `tools/rename_sym.sh`. You should be careful with this script: it has no error-checking!
Renaming symbols in theory requires re-disassembly. This can often be avoided in the case of functions by running `tools/rename_global_asm.py`, which will rename any individual functions' assembly files with the wrong names, so that the `GLOBAL_ASM`s can spot them. Renaming variables *may* require redisassembly (and if fake symbols are removed, it *will*).
## EnRecepgirl
Currently, the file looks like this:
@ -262,12 +260,15 @@ void EnRecepgirl_Draw(Actor* thisx, PlayState* play) {
(We can delete the `GLOBAL_ASM` lines now.)
The worst part of documentation is finding somewhere to start. We have a decent place to start here, though, in that we already know the function (or rather, the use) of a couple of the functions, namely the LimbDraws. So we can rename `func_80C10558` to `EnRecepgirl_OverrideLimbDraw` and `func_80C10590` to `EnRecepgirl_TransformLimbDraw`. Remember to do a global rename, and so that the functions' assembly files are renamed too, run `rename_global_asm`,
```
$ ./tools/rename_global_asm.py
asm/non_matchings/overlays/ovl_En_Recepgirl/func_80C10558.s --> asm/non_matchings/overlays/ovl_En_Recepgirl/EnRecepgirl_OverrideLimbDraw.s
asm/non_matchings/overlays/ovl_En_Recepgirl/func_80C10590.s --> asm/non_matchings/overlays/ovl_En_Recepgirl/EnRecepgirl_UnkLimbDraw.s
```
as well as the mentions in this chunk of `functions.txt`:
```
0x80C0FFD0:("EnRecepgirl_Init",),
0x80C100CC:("EnRecepgirl_Destroy",),
@ -283,6 +284,7 @@ as well as the mentions in this chunk of `functions.txt`:
```
That's probably as much as we can do on functions for now. Next let's think about some of the variables. We have essentially 3 sorts of variable here
- struct variables
- data/bss
- intrafunction/stack variables
@ -290,6 +292,7 @@ That's probably as much as we can do on functions for now. Next let's think abou
and this is roughly the order of preference for naming them (although not necessarily the logical order to determine what they do). This actor is quite limited in the last category: only `sp30` is unnamed at the moment. Even though `Actor_TrackPlayer` is decomped, the purpose of the argument in which `sp30` is placed is not clear (and, indeed, is not named), so it's probably best to leave it unnamed for now. (With greater experience, you might analyse `Actor_TrackPlayer` to work out what this argument is for, but let's not worry about that for now.)
As for the struct, there are two unnamed variables at the moment:
```C
typedef struct EnRecepgirl {
/* 0x000 */ Actor actor;
@ -303,12 +306,15 @@ typedef struct EnRecepgirl {
```
Let's start with `unk_2AC`. This is set to `2` in `Init`, something interesting happens to it in `func_80C100DC`, but it is used in the `Draw`, here:
```C
gSPSegment(POLY_OPA_DISP++, 0x08, D_80C106B0[this->unk_2AC]);
```
So it is used as an index into the array `D_80C106B0`, and the element with that index is placed on segment `8`. So we need to work out what this array is to name `unk_2AC`.
As we discussed last time, `D_80C106B0` is an array of [segmented pointers](data.md#segmented-pointers). Since they are in segment `6`, they are in the actor's object file. Which object? The InitVars tell us: namely,
```C
const ActorInit En_Recepgirl_InitVars = {
ACTOR_EN_RECEPGIRL,
@ -316,8 +322,8 @@ const ActorInit En_Recepgirl_InitVars = {
FLAGS,
OBJECT_BG,
```
the fourth element is the object (it is actually an enum, but the file itself has the same name as the object enum). So, we need to look at the object file. We are very lucky that a custom tool has been written for such a thing: Z64Utils.
## Z64Utils
@ -336,14 +342,17 @@ Go to "Analysis -> Find Dlists" and press OK (the defaults are usually fine). Th
![Z64Utils, with an analyzed object](images/z64utils_object_analyzed.png)
We will talk about what all these types of data are next time, but for now, all we want to know is what
```C
static void* D_80C106B0[4] = { object_bg_Tex_00F8F0, object_bg_Tex_00FCF0, object_bg_Tex_0100F0, object_bg_Tex_00FCF0 };
```
actually are. We know they are set on segment 8, so we need to find where the skeleton uses them. We know from `object_bg_Skel_011B60` that this is at `0x06011B60`, so scroll down to it, right-click on it, and choose "Open in Skeleton Viewer". Pick an animation that we know it uses (sometimes Z64Utils misidentifies other things for animations), such as `object_bg_Anim_000968`, and you will get this error:
![Z64Utils, error when viewing skeleton](images/z64utils_skeleton_error.png)
It needs something to be set to segment `8`. Well, that's good, we know that the code does that! Let's find out what. Z64Utils tells you the address, so we can look up the displaylist that wants it: the relevant block is
```C
[...]
// Multi Command Macro Found (6 instructions)
@ -374,11 +383,13 @@ so we see that segment `8` is expecting a texture (we'll go into more detail abo
But what sort of textures? This is an NPC, so what textures on the model would it want to change? The answer is of course the eyes: most NPCs have eye textures, with some sort of routine for changing them to appear to blink. We can set the different textures onto segment `8` and see which is which, but this is enough to know that `D_80C106B0` can be `sEyeTextures` (`s` for `static`: they essentially have to be static so that we can name them like this without the names clashing), and that `unk_2AC` is `eyeTexIndex` (these names are not completely standard, but it's best to be as consistent as possible).
**N.B.** static data should not be renamed in the assembly or `variables.txt`, since assembly has no notion of file locality and there can be symbol clashes. Therefore it should only be renamed in its respective file, not globally.
```C
static TexturePtr sEyeTextures[] = { object_bg_Tex_00F8F0, object_bg_Tex_00FCF0, object_bg_Tex_0100F0, object_bg_Tex_00FCF0 };
```
And now it's rather more obvious what
```C
void func_80C100DC(EnRecepgirl* this) {
if (this->eyeTexIndex != 0) {
@ -391,11 +402,13 @@ void func_80C100DC(EnRecepgirl* this) {
}
}
```
is doing: it's running a kind of blink routine. This is slightly nonstandard: usually there is a separate timer, but this one simply perturbs the index away from `0` every frame with a 2% chance. This sort of function is usually called `Blink` or `UpdateEyes`. Since it is explicitly called in `Update`, we'll call it `UpdateEyes`, but either is fine; we'll standardise later.
We have two other pieces of data. There is a suggested name for the InitChain in the code already; just replace it and replace the first line in the definition.
This leaves one piece of data unnamed, `D_80C106C8`. This is initially set to `0`, checked in `Init` to decide whether to run the loop, and then set to `1` after the loop is finished:
```C
if (D_80C106C8 == 0) {
for (i = 0; i < 4; i++) {
@ -404,6 +417,7 @@ This leaves one piece of data unnamed, `D_80C106C8`. This is initially set to `0
D_80C106C8 = 1;
}
```
What is this doing? We need to understand that to name this variable.
The N64's processors cannot use segmented addresses: they need actual RAM addresses. Therefore the segmented addresses have to be converted before being placed on a segment: this is what `Lib_SegmentedToVirtual` does. So (somewhat unusually) this loop is modifying the addresses in the actor's actual data in RAM. Having converted the addresses once, it wouldn't make any sense to convert them again, but `Init` would run every time an instantiation of the actor is created. Therefore `D_80C106C8` is present to ensure that the addresses only get converted once: it is really a boolean that indicates if the addresses have been converted. So let's call it `texturesDesegmented`, and replace its values by `true` and `false`.
@ -411,6 +425,7 @@ The N64's processors cannot use segmented addresses: they need actual RAM addres
Finally, clearly `4` is linked to the data over which we're iterating: namely it's the size of the array. We have a macro for this, `ARRAY_COUNT(sEyeTextures)`.
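The once-only conversion can be sketched in isolation as follows. This is a toy model under stated assumptions, not the game's code: `sSegmentBases`, `ToySegmentedToVirtual`, `sToyEyeTextures` and `ToyActor_Init` are hypothetical names, addresses are held as plain integers, and a segmented address is assumed to carry the segment number in bits 24 to 27 and the offset in the low 24 bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_COUNT(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical stand-in for the segment table: base RAM address of each segment. */
static uintptr_t sSegmentBases[16];

/* Toy conversion: look up the segment's base address and add the offset. */
uintptr_t ToySegmentedToVirtual(uintptr_t segAddr) {
    return sSegmentBases[(segAddr >> 24) & 0xF] + (segAddr & 0x00FFFFFF);
}

/* Shared by every instance of the toy actor, like the static data in an overlay. */
static uintptr_t sToyEyeTextures[] = { 0x0600F8F0, 0x0600FCF0, 0x060100F0, 0x0600FCF0 };
static bool sTexturesDesegmented = false;

void ToyActor_Init(void) {
    size_t i;

    /* Convert in place only the first time any instance is initialised. */
    if (!sTexturesDesegmented) {
        for (i = 0; i < ARRAY_COUNT(sToyEyeTextures); i++) {
            sToyEyeTextures[i] = ToySegmentedToVirtual(sToyEyeTextures[i]);
        }
        sTexturesDesegmented = true;
    }
}
```

Without the flag, a second call to `ToyActor_Init` would feed already-converted RAM addresses back through the conversion and corrupt the array, which is the situation the real variable guards against.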
We've got one struct variable left. To find out what it does, we can look at a function that uses it, for example
```C
s32 EnRecepgirl_OverrideLimbDraw(PlayState* play, s32 limbIndex, Gfx** dList, Vec3f* pos, Vec3s* rot,
Actor* thisx) {
@ -431,6 +446,7 @@ void EnRecepgirl_UnkLimbDraw(PlayState* play, s32 limbIndex, Actor* thisx) {
}
}
```
It is used to do a rotation of whatever limb `5` is. (The `+=` is because `rot->x` is the base rotation of the limb, and we have to add the same thing to it every frame to keep the angle changed and constant.) We can use Z64Utils to find out which limb this is: setting segment `8` to one of what we now know are the eye textures, we can view the model in the skeleton viewer. The limb numbers in the object are one smaller than those in the actor (the root limb is only a concept for the code, not the object), so we find limb 4:
![Z64Utils highlighting a limb](images/z64utils_skeleton_head.png)
@ -530,15 +546,16 @@ void func_80C102D4(EnRecepgirl* this, PlayState* play) {
```
All this branching is to make the conversation look more diverse and interesting. Notably, though, `func_80C1019C` is set to start with, and is only changed when `Actor_ProcessTalkRequest(&this->actor, &play->state) != 0`. This is something to do with talking. The other function handles the rest of the conversation, and hands back to the first if `Message_GetState(&play->msgCtx) == 2`. This function is *something* to do with the text state, which will require `z_message` to be decomped. However, observation in-game will reveal this is something to do with ending dialogue. So we can conclude that the action functions are `EnRecepgirl_Wait` and `EnRecepgirl_Talk`. The setup functions are thus `EnRecepgirl_SetupWait` and `EnRecepgirl_SetupTalk`.
For more complex actors, we have a tool called `graphovl.py` that can produce function flow graphs for actors: running
For more complex actors, we have a tool called `graphovl.py` that can produce function flow graphs for actors: running
```
$ ./tools/graphovl/graphovl.py En_Recepgirl
```
produces
![EnRecepgirl's function flow graph](images/En_Recepgirl.gv.png)
## Miscellaneous other documentation
We like to make macros for reading an actor's `params` (indeed, this is required even if you don't know what the params are for). A simple example is `ObjTree`, which has the following code in its `Init` function:
@ -577,7 +594,6 @@ Notice that we use `thisx`: this makes the form of every one of these macros the
Much clearer!
We have now essentially documented this as far as we can without the object, so we'd better do that next.
Next: [Analysing object files](object_decomp.md)

View File

@ -38,7 +38,6 @@ void EnRecepgirl_Draw(Actor* thisx, PlayState* play) {
}
```
Notable features are the GraphicsContext temps, and blocks of the form
```C
@ -54,14 +53,15 @@ Each of these blocks converts into a graphics macro. They are usually (but not a
For our purposes, we only need one of the programs this provides: `gfxdis.f3dex2`.
Graphics commands are actually 64 bits long on the Nintendo 64. This code block is a result of instructions telling the processor what to do with the graphics pointer. There are two main types of graphics pointer (there are a couple of others used in `code`, but actors will only use these two),
- polyOpa ("opaque") for solid textures
- polyXlu ("Xlucent" i.e. "translucent") for translucent textures
Our example is polyOpa, not surprisingly since our receptionist is solid.
`words.w0` and `words.w1` contain the actual graphics instruction, in hex format. Usually, `w0` is constant and `w1` contains the arguments. To find out what sort of macro we are dealing with, we use `gfxdis.f3dex2`. `w1` is variable, but we need to give the program a constant placeholder. A common word to use is 12345678, so in this case we run
```
gfxdis.f3dex2 -x -g "POLY_OPA_DISP++" -d DB06002012345678
```
@ -73,6 +73,7 @@ gfxdis.f3dex2 -x -g "POLY_OPA_DISP++" -d DB06002012345678
Our standard now is to use decimal colors. If you have a constant second argument rather than a variable one, you can also use `-dc` to get decimal colors instead of the default hex.
The output looks like
```
gSPSegment(POLY_OPA_DISP++, 0x08, 0x12345678);
```
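To see roughly where the `0x08` comes from, the command can be pulled apart by hand. The sketch below assumes the usual F3DEX2 "move word" layout (opcode in the top byte of `w0`, then a table index byte, then a 16-bit byte offset into that table, with 4-byte segment table entries); the `Toy*` names are hypothetical and `gfxdis` remains the authoritative way to decode a command.

```c
#include <stdint.h>

/* Fields of the upper word of a move-word command, under the layout assumed above. */
typedef struct {
    uint8_t opcode;  /* top byte of w0 */
    uint8_t index;   /* which table is being written */
    uint16_t offset; /* byte offset into that table */
} ToyMoveWord;

/* Split a 64-bit graphics command into its two 32-bit words. */
void ToySplitCommand(uint64_t cmd, uint32_t* w0, uint32_t* w1) {
    *w0 = (uint32_t)(cmd >> 32);
    *w1 = (uint32_t)(cmd & 0xFFFFFFFFu);
}

/* Extract the three fields packed into w0. */
ToyMoveWord ToyDecodeMoveWord(uint32_t w0) {
    ToyMoveWord mw;

    mw.opcode = (uint8_t)(w0 >> 24);
    mw.index = (uint8_t)(w0 >> 16);
    mw.offset = (uint16_t)(w0 & 0xFFFF);
    return mw;
}

/* Each segment table entry is assumed to be 4 bytes, so segment = offset / 4. */
unsigned ToySegmentNumber(ToyMoveWord mw) {
    return mw.offset / 4u;
}
```

Feeding in `0xDB06002012345678` gives `w0 = 0xDB060020` and `w1 = 0x12345678`; the offset field of `w0` is `0x0020`, and `0x20 / 4` is `8`, matching the segment in the `gfxdis` output.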
@ -90,6 +91,7 @@ You repeat this for every block in the function.
If you have worked on OoT, you will be aware of the functions `Graph_OpenDisps` and `Graph_CloseDisps`, and might be surprised to see them missing here. These functions are actually a debug feature: the `OPEN_DISPS` and `CLOSE_DISPS` macros still exist, but they don't expand to functions. Of course this means you have to guess where they go. A sensible guess for `OPEN_DISPS` is where the `gfxCtx` temp assignment first happens; `CLOSE_DISPS` is a bit harder, although it's basically just a `}`, so it *shouldn't* matter as much.
It's sensible to eliminate all the `gfxCtx` temps and reintroduce as needed. Also remember to change the prototype and function definition back!
```C
s32 func_80C10558(PlayState* play, s32 limbIndex, Gfx **dList, Vec3f *pos, Vec3s *rot, Actor *actor);
#pragma GLOBAL_ASM("asm/non_matchings/overlays/ovl_En_Recepgirl/func_80C10558.s")
@ -116,11 +118,14 @@ void EnRecepgirl_Draw(Actor* thisx, PlayState* play) {
And this matches.
The last two functions in the actor are used as arguments in `SkelAnime_DrawTransformFlexOpa`. This is a `SkelAnime` function, except unlike the OoT ones, it has three function callback arguments instead of two: in `functions.h` or `z_skelanime.c`, we find
```C
void SkelAnime_DrawTransformFlexOpa(PlayState* play, void** skeleton, Vec3s* jointTable, s32 dListCount,
OverrideLimbDrawOpa overrideLimbDraw, PostLimbDrawOpa postLimbDraw, TransformLimbDrawOpa transformLimbDraw, Actor* actor)
```
The typedefs of the callbacks it uses are in `z64animation.h`:
```C
typedef s32 (*OverrideLimbDrawOpa)(struct PlayState* play, s32 limbIndex, Gfx** dList, Vec3f* pos, Vec3s* rot,
struct Actor* thisx);
@ -132,6 +137,7 @@ typedef void (*PostLimbDrawOpa)(struct PlayState* play, s32 limbIndex, Gfx** dLi
typedef void (*TransformLimbDrawOpa)(struct PlayState* play, s32 limbIndex, struct Actor* thisx);
```
which is where mips2c got them from.
In this case, only two of them are used, and it is these that are the last functions standing between us and a decompiled actor.
@ -139,6 +145,7 @@ In this case, only two of them are used, and it is these that are the last funct
## OverrideLimbDraw, PostLimbDraw, TransformLimbDraw
Well, we don't have a PostLimbDraw here, but as we see from the prototype, it's much the same as the OverrideLimbDraw but without the `pos` argument and no return value.
```C
s32 func_80C10558(PlayState* play, s32 limbIndex, Gfx **dList, Vec3f *pos, Vec3s *rot, Actor *actor) {
if (limbIndex == 5) {
@ -147,7 +154,9 @@ s32 func_80C10558(PlayState* play, s32 limbIndex, Gfx **dList, Vec3f *pos, Vec3s
return 0;
}
```
Only two things to do here: we need to use `EnRecepgirl` to get to `actor + 0x2B0`, and the return value is used as a boolean, so we replace `0` by `false` (`true` means "don't draw the limb", and is hardly ever used).
```C
s32 func_80C10558(PlayState* play, s32 limbIndex, Gfx **dList, Vec3f *pos, Vec3s *rot, Actor *thisx) {
EnRecepgirl* this = THIS;
@ -160,6 +169,7 @@ s32 func_80C10558(PlayState* play, s32 limbIndex, Gfx **dList, Vec3f *pos, Vec3s
```
As for the TransformLimbDraw, it has a much simpler prototype. mips2c gives
```C
void func_80C10590(PlayState* play, s32 limbIndex, Actor *actor) {
if (limbIndex == 5) {
@ -168,10 +178,13 @@ void func_80C10590(PlayState* play, s32 limbIndex, Actor *actor) {
}
}
```
There is only minor cleanup needed here:
There is only minor cleanup needed here:
- recasting the last argument,
- replacing the last argument of `Matrix_RotateYS` by the enum `MTXMODE_APPLY` (which means "use the current matrix instead of starting from a new identity matrix"), and the first argument by `0x400 - this->unk_2AE.x`.
- `(Vec3f *) &actor->focus` to `&actor->focus.pos` (this is the same issue as `(Actor*)this`, where mips2c doesn't climb deep enough into the struct).
```C
void func_80C10590(PlayState* play, s32 limbIndex, Actor *thisx) {
EnRecepgirl* this = THIS;
@ -186,6 +199,7 @@ void func_80C10590(PlayState* play, s32 limbIndex, Actor *thisx) {
## Some more examples: ObjTree
Since EnRecepgirl was a bit light on graphics macros, we will look at an example that has a few more. A nice simple one is `ObjTree_Draw`: the original mips2c output is
```C
void ObjTree_Draw(Actor* thisx, PlayState* play) {
s16 sp36;
@ -225,7 +239,9 @@ void ObjTree_Draw(Actor* thisx, PlayState* play) {
temp_v0_4->words.w0 = 0xDE000000;
}
```
We can see there are four blocks here, although only two different macros:
```C
temp_v0 = temp_s0->polyOpa.p;
temp_s0->polyOpa.p = temp_v0 + 8;
@ -233,12 +249,16 @@ We can see there are four blocks here, although only two different macros:
sp28 = temp_v0;
sp28->words.w1 = Matrix_NewMtx(play->state.gfxCtx);
```
gfxdis gives
```
$ gfxdis.f3dex2 -x -g POLY_OPA_DISP++ -d DA38000312345678
gSPMatrix(POLY_OPA_DISP++, 0x12345678, G_MTX_NOPUSH | G_MTX_LOAD | G_MTX_MODELVIEW);
```
so it becomes
```C
gSPMatrix(POLY_OPA_DISP++, Matrix_NewMtx(play->state.gfxCtx), G_MTX_NOPUSH | G_MTX_LOAD | G_MTX_MODELVIEW);
```
@ -249,11 +269,14 @@ gSPMatrix(POLY_OPA_DISP++, Matrix_NewMtx(play->state.gfxCtx), G_MTX_NOPUSH | G_M
temp_v0_2->words.w1 = (u32) &D_06000680;
temp_v0_2->words.w0 = 0xDE000000;
```
```
$ gfxdis.f3dex2 -x -g POLY_OPA_DISP++ -d DE00000012345678
gSPDisplayList(POLY_OPA_DISP++, 0x12345678);
```
so this one is
```C
gSPDisplayList(POLY_OPA_DISP++, D_06000680);
```
@ -265,16 +288,20 @@ gSPDisplayList(POLY_OPA_DISP++, D_06000680);
sp20 = temp_v0_3;
sp20->words.w1 = Matrix_NewMtx(play->state.gfxCtx);
```
This is the same as the first one. Indeed, it's identical.
```C
temp_v0_4 = temp_s0->polyOpa.p;
temp_s0->polyOpa.p = temp_v0_4 + 8;
temp_v0_4->words.w1 = (u32) &D_060007C8;
temp_v0_4->words.w0 = 0xDE000000;
```
This is the same as the second one, but with a different second word.
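The pattern mips2c emits for each block (save the pointer, advance it by 8 bytes, write `w0` and `w1`) is just what a graphics macro does when given `POLY_OPA_DISP++`. A minimal toy model, assuming nothing beyond a command being two 32-bit words; `ToyGfx`, `toySPDisplayList` and `ToyDrawTree` are hypothetical names, not the real `gbi.h` definitions:

```c
#include <stdint.h>

/* One graphics command: two 32-bit words, 8 bytes in total. */
typedef struct {
    uint32_t w0;
    uint32_t w1;
} ToyGfx;

/* Upper word seen in the display-list blocks above. */
#define TOY_OP_DISPLAY_LIST 0xDE000000u

/*
 * Toy display-list macro. `pkt` is evaluated exactly once, so passing
 * `disp++` writes one command and advances the pointer by one command.
 */
#define toySPDisplayList(pkt, dl)     \
    do {                              \
        ToyGfx* _g = (pkt);           \
        _g->w0 = TOY_OP_DISPLAY_LIST; \
        _g->w1 = (uint32_t)(dl);      \
    } while (0)

/* Append two display-list calls and return the advanced write pointer. */
ToyGfx* ToyDrawTree(ToyGfx* disp) {
    toySPDisplayList(disp++, 0x06000680u);
    toySPDisplayList(disp++, 0x060007C8u);
    return disp;
}
```

Each macro call here corresponds to one four-line block of mips2c output, which is why every such block collapses to a single line in the cleaned-up function.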
Tidying up and inserting `OPEN_DISPS` and `CLOSE_DISPS`, we end up with
```C
void ObjTree_Draw(Actor* thisx, PlayState* play) {
s16 sp36 = (f32) thisx->shape.rot.x;

View File

@ -0,0 +1,402 @@
# Introduction to git for decomp
`git` is a version control system: it allows you to keep different versions of files at the same time. It does this using a tree system:
- A *repository* is a directory containing files managed by git.
- A repository has one or more *branches*. A branch can be thought of as a pointer to a specific commit.
- A *commit* is one set of changes, the most basic "unit" when working with git. The key point about git is that it is possible to have several commits based on the same one.
- A *remote* is another copy of the same repository, usually on another computer or the Internet.
For example,
```bash
Remote
---o---o---o---o
Local
master
v
---o---o---o---o
\
o---o
^
A
```
Each `o` is a commit. The lines show the commit that each commit was based on: we can see in the local there is one commit with two commits based on it. The local has two branches, `master` and `A`, and they currently both point to the commits furthest along their respective chains.
Throughout this guide, text in `SCREAMING_SNAKE_CASE` represents fields for you to fill in with the appropriate text, e.g. `FILE` should be replaced with a particular filename.
## Setting up git
If you are on an ordinary Linux distribution (i.e. not Arch or similar) or WSL, you probably already have git installed, and if not, can get it by running `sudo apt install git` in a terminal window.
git commits are labelled with the committer's name and email (if you don't want an actual email address attached to them, GitHub will generate a fake one for you). To set these up for any repository you work on, run
```bash
git config --global user.name "NAME"
git config --global user.email "EMAIL_ADDRESS"
```
(omitting `--global` will set these only for the repository in the current folder).
To make a new repository in a directory:
```bash
git init
```
The repository is considered to have the same name as the directory in which it lives.
The default branch is set to be called `master`. If you want to call it something else, do
```bash
git init -b DEFAULT_BRANCH_NAME
```
Usually when working on decomp you will instead be cloning a repository from GitHub, though.
## Cloning a repository from GitHub
```bash
git clone REMOTE_URL
```
This will clone the repository associated with that URL into a subdirectory of the one you are currently in, with a name based on the repository name. For example,
```bash
git clone git@github.com:zeldaret/mm.git
```
will clone the MM repository into the subdirectory `mm`.
git automatically names the remote you cloned from `origin`. For decomp you probably want your own fork on GitHub to be `origin` instead, so give the main repository a different name when cloning:
```bash
git clone -o REMOTE_NAME REMOTE_URL
```
so
```bash
git clone -o upstream git@github.com:zeldaret/mm.git
```
## Configuring remotes
View all remotes associated with this repository. Each remote has a name and an address.
```bash
git remote -v
```
Rename remote `OLD` to `NEW`
```bash
git remote rename OLD NEW
```
Add a new remote
```bash
git remote add NAME URL
```
For example, a typical workflow to get a repository from GitHub is to fork it, then
```bash
git clone -o upstream git@github.com:zeldaret/mm.git
cd mm
git remote add origin git@github.com:yourgithubaccount/mm.git
```
## Managing branches
You should always work on a branch, to retain a clean copy of the repository that you know works on the master/main branch, and to enable you to switch between unrelated work easily.
To list the branches you currently have:
```bash
git branch
```
To make a new branch:
```bash
git branch NEW_BRANCH_NAME
```
To change branch:
```bash
git checkout BRANCH_NAME
```
To make a new branch and change to it in one command:
```bash
git checkout -b NEW_BRANCH_NAME
```
To delete a branch (e.g. if definitely no longer needed)
```bash
git branch -d BRANCH_NAME
```
## Committing
To make git remember your changes, you need to make a commit. Ordinarily a commit applies to only files that are *staged*.
To add files to staging:
```bash
git add FILES
```
To commit staged files:
```bash
git commit
```
This will open a text editor to write a commit message. It is generally expected that commit messages are informative and short. If you want to write the commit message in the terminal instead,
```bash
git commit -m "COMMIT_MESSAGE"
```
To unstage a file (but keep the changes)
```bash
git reset FILE
```
(To unstage everything, `git reset`)
To revert a file to its state at the last commit:
```bash
git checkout -- FILE
```
## Merging and Rebasing
Having worked separately on a branch A, you often want to incorporate changes from branch B into branch A. Considering the tree/commit structure, there are two possible ways to do this:
- Stick the two branches back together at their current commits. This is called *merging* B into A: diagrammatically,
```bash
---o---o---o---o A
\
o---o---o B
```
to
```bash
---o---o---o---o---m A
\ /
o-----o-----o B
```
where `m` is a *merge commit*.
- Go back to the common ancestor of A and B, take all the commits on A since then, and attempt to apply them to the tip of B. This is called *rebasing* A on B.
```bash
---o---o---o---o A
\
o---o---o B
```
to
```bash
---o
\
o---o---o---o---o---o A
^
B
```
(Notice that in neither case is B itself destroyed.)
Both have advantages and disadvantages.
- Merging is conceptually simpler, but generates an additional commit.
- Rebasing results in cleaner history and is usually easier to do because there are fewer changes in each commit, but makes it much harder to follow what has changed between commits before and after: *you should not rebase a branch other people are looking at* (e.g. in GitHub reviews, but also in other collaboration).
If you realise you have made a mistake while still merging, you can run
```bash
git merge --abort
```
or
```bash
git rebase --abort
```
as appropriate to stop attempting to merge/rebase.
If you only want to learn one of these, learn merging.
### Conflict resolution
If the two branches have both touched the same or nearby lines, git will not know which you want to keep. It will therefore pause the merge/rebase and tell you to decide. You will find a section of the file that looks like this:
```bash
<<<<<<< A
Some code
=======
Some other code
>>>>>>> B
```
from which you should pick one (or combine parts of both as appropriate), then delete the rest. (You definitely do not want the git artefacts left in!)
Notice also that the changes are applied in opposite ways in each case: merging B into A, the incoming changes are from B, whereas when rebasing, they come from A. *This means the conflict resolution in one is in the opposite order from the other.*
## Keeping up to date with remotes
To update git's information about remotes:
```bash
git fetch REMOTE_NAME
```
Not specifying `REMOTE_NAME` will fetch from the remote that the current branch tracks (or from `origin` if it does not track anything).
To fetch a remote branch's changes and merge them into the current branch:
```bash
git pull REMOTE_NAME BRANCH_NAME
```
To fetch and rebase the current branch on a remote branch:
```bash
git pull --rebase REMOTE_NAME BRANCH_NAME
```
It's generally better to use fetch and then merge/rebase separately, though, at least until you're more experienced with git.
To update a remote with changes from the current branch:
```bash
git push
```
To send changes ignoring any changes to the remote branch since local and remote went out of sync,
```bash
git push --force
```
To push a new branch to a remote and make your local branch *track* the new remote branch, you need to specify which remote to push it to:
```bash
git push -u REMOTE_NAME BRANCH_NAME
```
## Stashing
You may want to store your current changes without actually committing them, for example if you're halfway through something but want new changes from a remote. Running
```bash
git stash
```
will save changes to a "stash" that can be applied later, even to a different branch, with
```bash
git stash apply
```
which keeps the stash, or
```bash
git stash pop
```
which applies, and deletes it if application was successful.
You can also name and delete stashes and so on.
## Repository information
To get general information about what branch you are on, staged and modified files:
```bash
git status
```
This will also indicate any files with merge conflicts.
To see a line-by-line description of the changes currently made:
```bash
git diff
```
(you can also diff 2 branches or 2 commits)
git's most general and powerful information command is `log`, which is far too extensive to cover here.
## GitHub
Things are a little different on GitHub compared to working locally.
To make your own copy of a repository, click `Fork`.
Most repositories do not let people push to them directly. Instead, you make a Pull Request (PR) for merging your branch into one of theirs: this also allows other people to review your work. A Pull Request cannot be merged unless it has no conflicting files. We also use a continuous integration system called Jenkins to ensure that the branch's files are correctly formatted, that it builds the ROM correctly, and that it produces no new warnings; this will usually run automatically whenever you push new changes.
*A branch used for a PR is a public branch, so do not rebase it.* Apart from the usual problems with other people looking at your changes, it will detach all the GitHub review comments from where they are in the code.
It is possible to fix merge conflicts on GitHub, but not recommended. GitHub can do a few other git-related things, but most of the time it's just simpler to do it locally.
### Fetch and Merge
To fetch and merge on GitHub, navigate to your personal fork of the repository (e.g. `github.com/<username>/mm`). On this page there is a bar that states whether the repository is up to date. If it is not, it will say what is different. On the right side of that bar there is a drop-down menu labeled "Fetch upstream"; click this and there will be a button to "Fetch and Merge". Once this is done, run `git pull` locally to get all the new changes.
### Code Reviews
GitHub has the ability to review changes in a pull request one file at a time. When reviewing, you can review specific changes or comment on the file overall. To start a review:
- Go to the repository and click "Pull Requests"
- Click on a pull request to review
On the conversation tab you can leave general comments and reply to any other comments that have been made.
On the "Files changed" tab you will see the files changed in the pull request. You can change the format of the diff view in this tab by clicking the gear and choosing the unified or split view.
To leave a comment, hover over the line of code where you'd like to add a comment, and click the blue "+" icon. To comment on a block of multiple lines, click and drag the range of lines you wish to comment, then click the blue icon.
Optionally, to suggest a specific change to the line(s), click the `+/-` icon then edit the text within the suggestion block.
Once finished, click "Review changes" and type a comment summarizing your proposed changes and comments.
### Giving credit to others
If you want to credit other people who have contributed to a PR you have made, you can add a "Co-authored-by" line to the end of the commit message:
```bash
NORMAL_COMMIT_MESSAGE

Co-authored-by: GITHUB_USERNAME <GITHUB_EMAIL>
```
Note the blank line and the `<>` around the email. More information on precisely how to do this can be found in GitHub's own docs: <https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors>
## More information
This guide has only covered the basics that are required to work with git on a decomp repository. For more information
- Run a command with `--help`
- Ask in Discord
- Consult a reference such as https://git-scm.com/docs or https://www.atlassian.com/git/tutorials
*Always ask or research before doing anything drastic: git is sophisticated enough that it usually has a way to resolve problems itself.*

View File

@ -73,12 +73,14 @@ A lot of work has already been done on the code to bring it into a format that i
An *actor* is any thing in the game that moves or performs actions or interactions: Link is an actor, enemies are actors, NPCs are actors, props like grass are actors. The vast majority of actors are *overlays*, which means they are loaded only when the game needs them.
In the code, each actor is associated to several files: there is
In the code, each actor is associated to several files: there is
- the main .c file, e.g. `src/overlays/actors/ovl_En_Ms/z_en_ms.c`
- the actor's header file, e.g. `src/overlays/actors/ovl_En_Ms/z_en_ms.h`
- various .o files that tell the `make` script how to incorporate it into building the ROM,
- various .o files that tell the `make` script how to incorporate it into building the ROM,
and then for undecompiled actors, various assembly (.s) files, generally including:
and then for undecompiled actors, various assembly (.s) files, generally including:
- one for the actor's *data* (this usually includes things like its collision information, information about how to draw it, and various other stuff that is used in it), e.g. `data/overlays/actors/ovl_En_Ms.data.s`
- one for each function in the actor, e.g. `asm/non_matchings/overlays/actors/ovl_En_Ms/func_809529AC.s`

View File

@ -3,7 +3,6 @@
- Up: [Contents](contents.md)
- Previous: [Documenting](documenting.md)
## Preparing to PR
### Change the `spec`
@ -73,7 +72,6 @@ Run the formatting script `format.sh`, to format the C files in the standard way
To make sure the PR builds correctly with the current master, you need to merge `upstream/master` before you make the PR. This tends to break things that you then have to fix to get it to compile correctly again.
## Pull Requests
Push commits to your fork of the repository on GitHub, and then open a pull request. Name the PR something sensible, like
@ -87,7 +85,6 @@ and so on, although these four tend to cover most cases. Feel free to add a comm
Please also update the status of the file on Trello/the spreadsheet.
### Reviews
Pull requests may be reviewed by anyone (who knows enough about the conventions of the project), and all must be reviewed and approved by two leads and one extra contributor.

View File

@ -74,7 +74,6 @@ The first argument of `gsDPLoadTextureBlock` tells you the offset, the second th
The following is a list of the texture formats the Nintendo 64 supports, with their gfxdis names and ZAPD format names.
| Format name | Typing in `gsDPLoadTextureBlock` | "Format" in xml |
| ----------------------------------------------- | -------------------------------- | --------------- |
| 4-bit intensity (I) | `G_IM_FMT_I, G_IM_SIZ_4b` | i4 |
@ -111,6 +110,7 @@ If in doubt, look at completed objects in the repo, and if still in doubt, ask.
## Tools
We are very fortunate that several nice tools have been written recently that are excellent for documenting asset files:
- [Z64Utils](https://github.com/Random06457/Z64Utils/releases), for looking at displaylists, textures they reference, the skeleton, animations, etc.
- [Texture64](https://github.com/queueRAM/Texture64/releases), for looking at textures in all the common N64 formats (needed since Z64Utils cannot interpret textures not explicitly referenced in displaylists currently)
@ -149,6 +149,7 @@ For CI4 and CI8 textures, you might see improper-looking textures like so:
![An improper-looking CI8 texture](images/broken_texture.png)
The reason this happens is because ZAPD couldn't determine the TLUT for the texture, so it couldn't use the proper palette. To fix this, you can supply a `TlutOffset` to the texture like so:
```xml
<Texture Name="gGiantFaceEyeOpenTex" OutName="giant_face_eye_open" Format="ci8" Width="32" Height="64" Offset="0x5A80" TlutOffset="0x5380" />
```
@ -157,4 +158,4 @@ The reason this happens is because ZAPD couldn't determine the TLUT for the text
Texture animations are new to Majora's Mask, and they can be pretty tricky to understand. Luckily, there's some extensive documentation on how they're structured [here](https://github.com/zeldaret/mm/blob/master/tools/ZAPD/ZAPD/ZTextureAnimation.cpp). One useful thing to remember is that empty texture animations take the form of `00 00 00 06 00 00 00 00`. The process that automatically generated all the object XMLs sometimes failed to recognize this as an empty texture animation, so it puts it in various blobs or fails to account for it at all.
Next: [The merging process](merging.md)
Next: [The merging process](merging.md)

View File

@ -15,6 +15,7 @@ In the resulting window, go to "Analysis -> Find Dlists" and press OK (the defau
![Finding object_dns's SkeletonHeader in Z64Utils](images/z64utils_dns_skeletonheader.png)
When you open the Skeleton Viewer, you'll see a list of animations off to the side. Selecting one of them will display an error that says something like `RENDER ERROR AT 0x06001A98! (Could not read 0x80 bytes at address 08000000)`. This is because one of the display lists in the skeleton is expecting something to be set at segment 8. From the actor, we know that it's expecting the eye textures to be loaded into segment 8 like so:
```c
static TexturePtr D_8092DE1C[] = { &D_060028E8, &D_06002968, &D_060029E8, &D_06002968 };
[...]
@ -32,6 +33,7 @@ Now that we've gotten around the error, we can see what each limb in the skeleto
Note that some limbs don't actually render anything, so sometimes clicking on a limb will not turn anything red; this may indicate a "Root" limb that has no associated display list, or it may indicate something like an eye limb that doesn't have the right textures loaded to display anything in Z64Utils. It may be useful to skip ahead to [Step #5](#step-5-naming-limb-display-lists) to learn how to check if the limb has a display list. If it doesn't have a display list, then it's a "Root" limb that will never be highlighted.
We can now start naming the skeleton and individual limbs. Since we know this particular skeleton is the King's Chamber Deku Guard, we can name the skeleton `gKingsChamberDekuGuardSkel`. For the LimbNone name, we can call it something like `KINGS_CHAMBER_DEKU_GUARD_LIMB_NONE`, and we can name the LimbMax similarly. For the EnumName, we can name it `KingsChamberDekuGuardLimbs`. For each individual limb, we can name them based on what we see in Z64Utils; just make sure to update both the Name and the EnumName. After naming everything, we have something that looks like this:
```xml
<Limb Name="gKingsChamberDekuGuardTorsoLimb" Type="Standard" EnumName="KINGS_CHAMBER_DEKU_GUARD_LIMB_TORSO" Offset="0x2D18" />
<Limb Name="gKingsChamberDekuGuardHeadLimb" Type="Standard" EnumName="KINGS_CHAMBER_DEKU_GUARD_LIMB_HEAD" Offset="0x2D24" />
@ -49,22 +51,26 @@ We can now start naming the skeleton and individual limbs. Since we know this pa
```
Now we can run `./extract_assets.py -s objects/object_dns` to extract the object again, this time with our new names. What can we do with this? Quite a bit actually. In `z_en_dns.h`, we can add this to the top of the file to start using these new names in our code:
```c
#include "objects/object_dns/object_dns.h"
```
Now, we can redefine the `jointTable` and `morphTable` in terms of the limb enum we defined before, like so:
```c
/* 0x22A */ Vec3s jointTable[KINGS_CHAMBER_DEKU_GUARD_LIMB_MAX];
/* 0x278 */ Vec3s morphTable[KINGS_CHAMBER_DEKU_GUARD_LIMB_MAX];
```
We can also use our new skeleton name and limb enum when initializing the skeleton like so:
```c
SkelAnime_Init(play, &this->skelAnime, &gKingsChamberDekuGuardSkel, NULL, this->jointTable, this->morphTable, KINGS_CHAMBER_DEKU_GUARD_LIMB_MAX);
```
Lastly, we can use our limb enum in `EnDns_PostLimbDraw`. Where the code originally had:
```c
if (limbIndex == 2) {
[...]
@ -72,6 +78,7 @@ if (limbIndex == 2) {
```
We can instead write:
```c
if (limbIndex == KINGS_CHAMBER_DEKU_GUARD_LIMB_HEAD) {
[...]
@ -81,11 +88,13 @@ if (limbIndex == KINGS_CHAMBER_DEKU_GUARD_LIMB_HEAD) {
## Step 2: Naming the animations
Now that we have the skeleton figured out, it's time to name all the animations. In the Skeleton Viewer, you can hit the "play" button on any animation to see what it looks like. Note that some objects have multiple skeletons, and selecting an animation that is associated with a different skeleton than the one you're looking at can cause odd behavior. Try to give each animation a descriptive name based on what it looks like. If you're struggling:
- Try viewing the animation in game. In what contexts does this animation play?
- Try analyzing the code for the actor to see when the animation is used. Is this animation ever referenced?
- If you're still really struggling, Majora's Mask 3D contains the original animation names for the majority of animations in the game. These original names can help you figure out what the developers were originally intending. Explaining how to find these animations in MM3D is outside of the scope of this document, so just ask in Discord if you want to try this.
After naming the animations, the end result will look something like this:
```xml
<Animation Name="gKingsChamberDekuGuardDanceAnim" Offset="0x2A8" />
<Animation Name="gKingsChamberDekuGuardFlipAnim" Offset="0x734" />
@ -99,6 +108,7 @@ After naming the animations, the end result will look something like this:
```
Once again, we can run `./extract_assets.py -s objects/object_dns` to extract the object, and we can update the animation names in `z_en_dns.c` to use our new names like so:
```c
static AnimationInfoS sAnimations[] = {
{ &gKingsChamberDekuGuardIdleAnim, 1.0f, 0, -1, ANIMMODE_LOOP, 0 },
@ -117,16 +127,19 @@ static AnimationInfoS sAnimations[] = {
## Step #3: Identifying the blob
In the XML, you may notice undefined blobs like this:
```xml
<!-- <Blob Name="object_dns_Blob_0028E8" Size="0x180" Offset="0x28E8" /> -->
```
You might already have an idea as to what this is based on what you've seen before. Recall that the eye textures are referenced in the actor's code like this:
```c
static TexturePtr D_8092DE1C[] = { &D_060028E8, &D_06002968, &D_060029E8, &D_06002968 };
```
Do you notice how the "28E8" in `D_060028E8` also appears as the Offset in that blob? That's because the blob is just the eye textures; the process for automatically creating the XML wasn't able to figure it out on its own, so we'll need to do it ourselves. But how should we define these textures in the XML? Recall that the eye textures were loaded into segment 8; let's take a look in `object_dns.c` and see if we can find something that uses this segment. This display list has the answer:
```c
Gfx object_dns_DL_001A50[] = {
[...]
@ -137,6 +150,7 @@ Gfx object_dns_DL_001A50[] = {
```
Using `0x08000000` with `gsDPLoadTextureBlock` signals that this display list is expecting a texture in segment 8. What kind of texture is it expecting? We can look at the arguments after the `0x08000000`. It's looking for an RGBA16 texture with dimensions of 8x8, so we can define these textures in the XML like so:
```xml
<Texture Name="object_dns_Tex_0028E8" OutName="tex_0028E8" Format="rgba16" Width="8" Height="8" Offset="0x28E8" />
<Texture Name="object_dns_Tex_002968" OutName="tex_002968" Format="rgba16" Width="8" Height="8" Offset="0x2968" />
@ -144,6 +158,7 @@ Using `0x08000000` with `gsDPLoadTextureBlock` signals that this display list is
```
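As a sanity check on those dimensions: an RGBA16 texel takes 2 bytes, so an 8x8 texture occupies 0x80 bytes, and the 0x180-byte blob holds exactly three of them, at offsets 0x28E8, 0x2968 and 0x29E8. The following is a minimal sketch of that arithmetic; the helper names are hypothetical and not part of the repository.

```c
#include <stddef.h>

/* RGBA16 stores 5/5/5/1 bits per texel, i.e. 16 bits = 2 bytes. */
#define RGBA16_BYTES_PER_TEXEL 2

/* Size in bytes of a width x height RGBA16 texture (hypothetical helper). */
size_t Rgba16TextureSize(size_t width, size_t height) {
    return width * height * RGBA16_BYTES_PER_TEXEL;
}

/* Offset of the index-th texture in a tightly packed run that starts at blobOffset. */
size_t TextureOffsetInBlob(size_t blobOffset, size_t index, size_t textureSize) {
    return blobOffset + index * textureSize;
}
```

With the blob from the XML (`Offset="0x28E8"`, `Size="0x180"`), this gives three textures whose offsets line up with `D_060028E8`, `D_06002968` and `D_060029E8` in the actor's code.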
Now, we just have to name them. In [Step #1](#step-1-naming-the-skeleton-and-limbs), we set segment 8 to one of the eye textures; we can use that same technique with the other two eye textures to see what they are. Like most NPCs, these various eye textures are used for handling blinking, so we can name them based on how open the eye is:
```xml
<Texture Name="gKingsChamberDekuGuardEyeOpenTex" OutName="kings_chamber_deku_guard_eye_open" Format="rgba16" Width="8" Height="8" Offset="0x28E8" />
<Texture Name="gKingsChamberDekuGuardEyeHalfTex" OutName="kings_chamber_deku_guard_eye_half" Format="rgba16" Width="8" Height="8" Offset="0x2968" />
@ -151,6 +166,7 @@ Now, we just have to name them. In [Step #1](#step-1-naming-the-skeleton-and-lim
```
Like with previous steps, we can run `./extract_assets.py -s objects/object_dns` and then update `z_en_dns.c` with our new names:
```c
static TexturePtr sEyeTextures[] = {
gKingsChamberDekuGuardEyeOpenTex,
@ -165,6 +181,7 @@ Note that this step might be tricky to do if multiple things in the actor use th
## Step #4: Naming anything else in the actor
For some actors, there may be a few other things left to name that are directly referenced in the actor's code. In our case, there is one display list that we still need to name:
```c
gSPDisplayList(POLY_OPA_DISP++, &D_06002C48);
```
@ -178,11 +195,13 @@ We can see this is the guard's Deku Flower:
![Showing the guard's Deku Flower in Z64Utils](images/z64utils_dns_deku_flower.png)
We can name the display list as such in the XML:
```xml
<DList Name="gKingsChamberDekuGuardDekuFlower" Offset="0x2C48" />
```
Then, like all steps before, we can run `./extract_assets.py -s objects/object_dns` and then update `z_en_dns.c` with our new name:
```c
gSPDisplayList(POLY_OPA_DISP++, gKingsChamberDekuGuardDekuFlower);
```
@ -194,6 +213,7 @@ Now that we've named everything that's used externally by the actor, we just nee
![Showing the head limb's display list in Z64Utils](images/z64utils_dns_limb_dlist.png)
Another way is to simply check `object_dns.c`. Each limb lists its own display list like this:
```c
StandardLimb gKingsChamberDekuGuardHeadLimb = {
{ 0, 1300, 0 }, KINGS_CHAMBER_DEKU_GUARD_LIMB_STALK - 1, KINGS_CHAMBER_DEKU_GUARD_LIMB_LEFT_FOOT - 1,
@ -202,6 +222,7 @@ StandardLimb gKingsChamberDekuGuardHeadLimb = {
```
Either way you go about it, you should be able to name all the limb display lists like so:
```xml
<DList Name="gKingsChamberDekuGuardRightFootDL" Offset="0x1640" />
<DList Name="gKingsChamberDekuGuardLeftFootDL" Offset="0x16F0" />
@ -222,6 +243,7 @@ Run `./extract_assets.py -s objects/object_dns` once again, since it will help i
### Step #6: Naming remaining textures
With every display list named, it's now a lot easier to name the remaining textures. In the `assets/objects/object_dns/` folder, you can see all the textures in the object as various PNG files. For some of the textures, just looking at them will give you a good idea as to what they should be named. For other textures, it may help to see how the texture is used in the object's display lists. Let's take a look at `object_dns_Tex_002868`, which is only used in one display list:
```c
Gfx gKingsChamberDekuGuardSnoutDL[] = {
[...]
@ -240,6 +262,7 @@ Now, rebuild the game using [the steps described here](object_decomp.md#building
![Our custom mouth texture being shown in-game](images/custom_texture_in_game.png)
This confirms our suspicion that this is indeed the mouth texture, so we can name it as such. We can use similar strategies to name all the other textures like so:
```xml
<Texture Name="gKingsChamberDekuGuardLeafTex" OutName="kings_chamber_deku_guard_leaf" Format="rgba16" Width="32" Height="32" Offset="0x1E68" />
<Texture Name="gKingsChamberDekuGuardBodyTex" OutName="kings_chamber_deku_guard_body" Format="rgba16" Width="16" Height="16" Offset="0x2668" />
@ -249,10 +272,11 @@ This confirms our suspicion that this is indeed the mouth texture, so we can nam
### Step #7: Finishing up
If you have any other unnamed assets, now's the time to identify them. Otherwise, finish up the file by putting a comment at the top above the `<File>` node:
```xml
<Root>
<!-- Assets for the King's Chamber Deku Guards -->
<File Name="object_dns" Segment="6">
```
And we're done! Hopefully, you found this example helpful when decompiling your own objects.


@ -10,6 +10,7 @@ At this point we have a choice to make. Either we could follow the main function
## Destroy
Destroy will be a dead end, but we might as well do it now. Usually we would regenerate the context first and apply it to mips2c as with `Init`, but if we look at the assembly...
```mips
glabel EnRecepgirl_Destroy
/* 0000FC 80C100CC AFA40000 */ sw $a0, ($sp)
@ -17,24 +18,30 @@ glabel EnRecepgirl_Destroy
/* 000104 80C100D4 03E00008 */ jr $ra
/* 000108 80C100D8 00000000 */ nop
```
It doesn't seem to do anything. Indeed, chucking it in mips2c,
```
$ ../mips_to_c/mips_to_c.py asm/non_matchings/overlays/ovl_En_Recepgirl/EnRecepgirl_Destroy.s
void EnRecepgirl_Destroy(s32 arg0, ? arg1) {
}
```
so it really does do nothing. It is worth staying on this briefly to understand what it is doing, though. Even with no context, mips2c knows it takes two arguments because it does two saves onto the stack: the calling convention the N64 uses requires the first four arguments be saved from the registers onto the stack, since the registers are expected to be cleared when a function call happens. It's done a bad job of guessing what they are, but that's to be expected: the assembly only tells us they're words. Thankfully we already know in this case, so we can just replace the `GLOBAL_ASM` by
```C
void EnRecepgirl_Destroy(Actor* thisx, PlayState* play) {
}
```
and cross this function off.
## `func_80C10148`
We don't really have a choice now: we have to look at this function. Remake the context (no need to change the function type this time), and run mips2c on the function's assembly file:
```
$ ../mips_to_c/mips_to_c.py asm/non_matchings/overlays/ovl_En_Recepgirl/func_80C10148.s data/ovl_En_Recepgirl/ovl_En_Recepgirl.data.s --context ctx.c
extern AnimationHeader D_0600AD98;
@ -53,16 +60,21 @@ void func_80C10148(EnRecepgirl *this) {
```
This gives us some information immediately: `D_0600AD98` is an `AnimationHeader`, and `func_80C1019C` is set as the action function. This means that we know its type, even though mips2c does not: looking in the header, we see the typedef is
```C
typedef void (*EnRecepgirlActionFunc)(struct EnRecepgirl*, PlayState*);
```
and so we prototype `func_80C1019C` as
```C
void func_80C1019C(EnRecepgirl* this, PlayState* play);
```
at the top (were it above the function we're currently working on, the prototype could eventually be replaced by the function definition itself, but since it isn't, it goes at the top with the others).
There are several rather odd things going on here:
- `temp_a0` is only used once. As such it's probably fake.
- There's a weird `this = this` that does nothing
- `if (&D_06001384 == this->skelAnime.animation)` is a bit of a funny way to write the condition: it seems more likely it would be the other way round.
@ -70,6 +82,7 @@ There are several rather odd things going on here:
- `func_80C1019C` is already a pointer, so the `&` is ineffectual. Our style is to not use `&` on function pointers.
If we tackle these, we end up with
```C
void func_80C10148(EnRecepgirl* this);
@ -92,14 +105,15 @@ void func_80C10148(EnRecepgirl *this) {
this->actionFunc = func_80C1019C;
}
```
This is a common type of function called a setup (action) function. It runs once and prepares the ground for its corresponding action function to run, whereas the action function is usually run every frame by `Update` (but more on that later). Running `make`, we get OK again.
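The setup/action split can be sketched in isolation. The block below is a minimal illustration, not code from the repository: `DemoActor`, its fields and its functions are hypothetical stand-ins, and `PlayState` is reduced to a placeholder struct so the example compiles on its own.

```c
/* Placeholder for the real PlayState, which is far larger. */
typedef struct PlayState {
    int frame;
} PlayState;

struct DemoActor;
typedef void (*DemoActorActionFunc)(struct DemoActor*, PlayState*);

/* Hypothetical actor holding only what the pattern needs. */
typedef struct DemoActor {
    DemoActorActionFunc actionFunc;
    int setupCount; /* how many times the setup function ran */
    int framesRun;  /* how many frames the action function ran */
} DemoActor;

void DemoActor_Idle(DemoActor* this, PlayState* play);

/* Setup function: runs once and hands over to the action function. */
void DemoActor_SetupIdle(DemoActor* this) {
    this->setupCount++;
    this->actionFunc = DemoActor_Idle; /* no & on function pointers */
}

/* Action function: runs every frame while it is the current actionFunc. */
void DemoActor_Idle(DemoActor* this, PlayState* play) {
    (void)play; /* unused in this sketch */
    this->framesRun++;
}

/* Update calls whatever action function is currently set. */
void DemoActor_Update(DemoActor* this, PlayState* play) {
    this->actionFunc(this, play);
}
```

Calling `DemoActor_SetupIdle` once and `DemoActor_Update` on each frame leaves `setupCount` at 1 while `framesRun` keeps growing, which is the division of labour described above.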
Again, we have only one way to go.
## `func_80C1019C`
Remake the context and run mips2c on this function's assembly file. We get
```C
? func_80C10290(EnRecepgirl *); // extern
@ -135,7 +149,9 @@ void func_80C1019C(EnRecepgirl* this, PlayState* play) {
}
}
```
This is a bit juicier! We can do some preliminary cleanup, then worry about the control flow.
- `sp24` does nothing, so is almost certainly fake.
- `temp_a0` is used in 3 different places, but they're all right next to one another and are unlikely to be required since there's no nontrivial calculation or anything happening. Let's remove it too and see what happens.
- We've got another reversed comparison, `&D_0600A280 == this->skelAnime.animation`.
@ -145,6 +161,7 @@ This is a bit juicier! We can do some preliminary cleanup, then worry about the
- Prototype `func_80C10290`: it is reasonable to guess it's another setup function, so `void func_80C10290(EnRecepgirl* this);`.
Changing all these, we end up with
```C
void func_80C10148(EnRecepgirl* this);
void func_80C1019C(EnRecepgirl* this, PlayState* play);
@ -187,12 +204,15 @@ void func_80C1019C(EnRecepgirl* this, PlayState* play) {
}
}
```
If we look with diff.py, we find this matches. But we can replace some of the `return`s by `else`s: generally, we use `else`s and keep an early return only in these cases:
- After an `Actor_Kill`
- Sometimes after setting an action function
- When there's no way to avoid an early return
Here, it's debatable whether to keep the first, since `func_80C10290` is likely a setup function. The latter two should be changed to elses, though. For now, let's replace all of them. This leaves us with
```C
void func_80C1019C(EnRecepgirl* this, PlayState* play) {
if (SkelAnime_Update(&this->skelAnime) != 0) {
@ -217,7 +237,9 @@ void func_80C1019C(EnRecepgirl* this, PlayState* play) {
}
}
```
which still matches. Lastly, we have an enum for the output of `Player_GetMask` and other mask-related things: in `z64player.h` we find
```C
typedef enum {
/* 0x00 */ PLAYER_MASK_NONE,
@ -231,10 +253,10 @@ and so we can write the last if as `Player_GetMask(play) == PLAYER_MASK_KAFEIS_M
Again, we have no choice in what to do next.
## `func_80C10290`
Remaking the context and running mips2c gives
```C
void func_80C102D4(EnRecepgirl*, PlayState*); // extern
@ -243,8 +265,8 @@ void func_80C10290(EnRecepgirl *this) {
this->actionFunc = func_80C102D4;
}
```
so all we have to do is add the function prototype for the newest action function. Not surprisingly, this matches without changing anything.
## `func_80C102D4`
@ -320,6 +342,7 @@ void func_80C102D4(EnRecepgirl* this, PlayState* play) {
</details>
Well, this is a big one! We get one more extern, for `D_06000968`. A lot of the temps used in the conditionals look fake, with the exception of `temp_v0_2`: because the function is only called once but the temp is used twice, the temp must be real. Removing the others and switching the `animation` conditionals,
```C
void func_80C102D4(EnRecepgirl* this, PlayState* play) {
u8 temp_v0_2;
@ -373,11 +396,14 @@ void func_80C102D4(EnRecepgirl* this, PlayState* play) {
}
}
```
There remains one thing we need to fix before trying to compile it, namely `*(&gSaveContext + 0xF37) & 0x80`. This is really a funny way of writing an array access, because mips2c will get confused about arrays in structs. Opening up `z64save.h`, we find in the `SaveContext` struct that
```C
/* 0x0EF8 */ u8 weekEventReg[100]; // "week_event_reg"
/* 0x0F5C */ u32 mapsVisited; // "area_arrival"
```
so it's somewhere in `weekEventReg`. `0xF37 - 0xEF8 = 0x3F = 63`, and it's a byte array, so the access is actually `gSaveContext.save.weekEventReg[63] & 0x80`. Now it will compile. We also don't use `!= 0` for flag comparisons: just `if (gSaveContext.save.weekEventReg[63] & 0x80)` will do.
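The offset arithmetic can be reproduced with `offsetof` on a mock struct. This is only a sketch: `MockSaveContext` is hypothetical, flattens away the `save` member, and replaces everything before `weekEventReg` with a filler array; only the two quoted members and their offsets come from `z64save.h`.

```c
#include <stddef.h>

typedef unsigned char u8;
typedef unsigned int u32;

/* Mock layout: the filler array stands in for everything before weekEventReg. */
typedef struct MockSaveContext {
    /* 0x0000 */ u8 filler[0xEF8];
    /* 0x0EF8 */ u8 weekEventReg[100];
    /* 0x0F5C */ u32 mapsVisited;
} MockSaveContext;

/* Convert a raw byte offset from the start of the struct into a weekEventReg index. */
size_t WeekEventRegIndex(size_t rawOffset) {
    return (rawOffset - offsetof(MockSaveContext, weekEventReg)) / sizeof(u8);
}
```

Here `WeekEventRegIndex(0xF37)` gives 63, matching the subtraction done by hand. The same approach works for any `*(&struct + offset)` expression mips2c produces, as long as the element size of the array is taken into account.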
Running `./diff.py -mwo3 func_80C102D4` and scrolling down, we discover that this doesn't match!
@ -385,10 +411,13 @@ Running `./diff.py -mwo3 func_80C102D4` and scrolling down, we discover that thi
![First run of diff.py on func_80C102D4](images/func_80C102D4_diff1.png)
The yellow shows registers that don't match; the different colours on the registers help you to estimate where the problems are. Usually it's best to start at the top and work down if possible: any regalloc problems at the top tend to propagate most of the way down. In our case, the first problem is
```
3f0: andi t0,v0,0xff r 153 3f0: andi t1,v0,0xff
```
somehow we skipped over `t0`. Where is this in the code? The `153` in the middle is the line number in the C file, and the `3f0`s are the offsets into the assembly file. There is `--source` if you want to see the code explicitly, or you can do it the old-fashioned way and work it out from nearby function calls. In this case, `func_80C10148` is run straight after, and the only place that is called is
```C
temp_v0_2 = Message_GetState(&play->msgCtx);
if (temp_v0_2 == 2) {
@ -408,12 +437,12 @@ Notice that indeed the subsequent regalloc, which might have looked like a bigge
And now we've run out of functions. Time for `Update`.
## Update
Update runs every frame and usually is responsible for the actor's common logic updates: for example, updating timers, blinking, updating collision, running the `actionFunc`, and so on, either directly or through other functions it calls. A lot of subsidiary functions that are not common to every state (e.g. updating position, or the text when talking, etc.) are carried out by one of the action functions we have already decomped.
Remake the context and run mips2c:
```C
? func_80C100DC(EnRecepgirl *); // extern
@ -426,9 +455,11 @@ void EnRecepgirl_Update(Actor* thisx, PlayState* play) {
func_80C100DC(this);
}
```
If we search for `func_80C100DC`, we find that this is the only time it is used. Hence we can be almost certain that its prototype is `void func_80C100DC(EnRecepgirl* this);`. This function occurs above `Update`, so you can put the prototype next to the `GLOBAL_ASM` and remove it when we decompile that function.
Change the function and the prototype back to `Actor* thisx`, and add the casting temp:
```C
void func_80C100DC(EnRecepgirl *);
#pragma GLOBAL_ASM("asm/non_matchings/overlays/ovl_En_Recepgirl/func_80C100DC.s")
@ -444,16 +475,21 @@ void EnRecepgirl_Update(Actor* thisx, PlayState* play) {
func_80C100DC(this);
}
```
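The casting temp relies on the `Actor` being the first member of the actor's struct, so a pointer to one is also a pointer to the other. A minimal sketch of the idea, using hypothetical mock types rather than the real `Actor` and `EnRecepgirl`, and assuming `THIS` is a plain cast of `thisx`:

```c
/* Mock of the base struct; the real Actor is 0x144 bytes. */
typedef struct Actor {
    int id;
} Actor;

/* Mock actor: the Actor must be the first member for the cast to be valid. */
typedef struct MockRecepgirl {
    Actor actor;
    int timer;
} MockRecepgirl;

/* Assumed shape of the per-file THIS macro. */
#define THIS ((MockRecepgirl*)thisx)

/* Main functions take Actor* thisx and recover the full struct via the casting temp. */
void MockRecepgirl_Update(Actor* thisx) {
    MockRecepgirl* this = THIS;

    this->timer++;
}
```

The engine only knows about `Actor*`, which is why `Init`, `Destroy`, `Update` and `Draw` keep the `Actor* thisx` signature while internal functions can take the specific type directly.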
Now, our problem is `Actor_TrackPlayer`. The arguments all look terrible! Indeed, if we look at the actual function in `src/code/code_800E8EA0.c` (found by searching), we find that it should be
```C
s32 Actor_TrackPlayer(PlayState* play, Actor* actor, Vec3s* headRot, Vec3s* torsoRot, Vec3f focusPos)
```
So mips2c has made a bit of a mess here:
- The third argument should be a `Vec3s*`. Hence `this + 0x2AE` is a `Vec3s*`, and so `this->unk_2AE` is a `Vec3s`
- `&sp30` is a `Vec3s*`, so `sp30` is a `Vec3s` (it's clearly not used for anything, just used to "dump" a side-effect of the function)
- The last argument is supposed to be an actual `Vec3f`
Fixing all of this, we end up with
```C
void EnRecepgirl_Update(Actor* thisx, PlayState* play) {
EnRecepgirl* this = THIS;
@ -464,7 +500,9 @@ void EnRecepgirl_Update(EnRecepgirl* this, PlayState* play) {
func_80C100DC(this);
}
```
and can fill in the top end of the struct:
```C
typedef struct EnRecepgirl {
/* 0x0000 */ Actor actor;
@ -494,6 +532,7 @@ void EnRecepgirl_Update(Actor* thisx, PlayState* play) {
func_80C100DC(this);
}
```
and this now matches.
**N.B.** sometimes using an actual `PlayState* play` temp is required for matching: add it to your bag o' matching memes.
@ -516,6 +555,7 @@ Anyway, back to EnRecepgirl. 4 functions to go...
## `func_80C100DC`
This is the final non-draw function. You know what to do now: remake the context and run mips2c:
```C
void func_80C100DC(EnRecepgirl *this) {
u8 temp_t6;
@ -547,24 +587,30 @@ Well, it's still *pretty* close. But the registers are all wrong. Firstly, `temp
![func_80C100DC, second diff](images/func_80C100DC_diff2.png)
It's not obvious that did much: it even looks a bit worse.
```C
temp_v0 = this->unk_2AC;
temp_t6 = temp_v0 + 1;
if (temp_v0 != 0) {
this->unk_2AC = temp_t6;
```
may remind you of that loop we decompiled, where mips2c unnecessarily made two temps. Let's walk through what this does.
- First, it saves the value of `this->unk_2AC` into `v0`
- Then, it adds one to it and stores it in `t6`.
- It checks if the first saved value is nonzero.
- If it is, it sets `this->unk_2AC` to the incremented value and carries on.
Well, if we allow ourselves to bend the order of operations a little, there's a much simpler way to write this with no temps, namely
```C
if (this->unk_2AC != 0) {
this->unk_2AC++;
```
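To convince yourself that the two forms behave identically, they can be compared over every possible starting value. The sketch below assumes `unk_2AC` is a `u8` (suggested by the `u8 temp_t6` that mips2c produced) and uses a hypothetical one-field mock struct; it models only the increment, not the rest of the function.

```c
typedef unsigned char u8;

/* Mock struct holding only the field under discussion. */
typedef struct MockActor {
    u8 unk_2AC;
} MockActor;

/* mips2c-style version with two temps. */
void Increment_WithTemps(MockActor* this) {
    u8 temp_v0 = this->unk_2AC;
    u8 temp_t6 = temp_v0 + 1;

    if (temp_v0 != 0) {
        this->unk_2AC = temp_t6;
    }
}

/* Simplified version with no temps. */
void Increment_NoTemps(MockActor* this) {
    if (this->unk_2AC != 0) {
        this->unk_2AC++;
    }
}

/* Returns 1 if both versions agree for every possible u8 starting value. */
int VersionsAgree(void) {
    int i;

    for (i = 0; i < 256; i++) {
        MockActor a = { (u8)i };
        MockActor b = { (u8)i };

        Increment_WithTemps(&a);
        Increment_NoTemps(&b);
        if (a.unk_2AC != b.unk_2AC) {
            return 0;
        }
    }
    return 1;
}
```

Behavioural equivalence is necessary but not sufficient for matching: the compiler may still allocate registers differently, which is why the result has to be checked with diff.py.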
So let's try removing both temps:
```C
void func_80C100DC(EnRecepgirl *this) {
if (this->unk_2AC != 0) {
@ -586,6 +632,7 @@ void func_80C100DC(EnRecepgirl *this) {
There we go.
Even though this matches, it is not quite according to our style: remember what was said earlier about early returns. Here, both of them can be removed and replaced by a single else without affecting matching:
```C
void func_80C100DC(EnRecepgirl *this) {
if (this->unk_2AC != 0) {
@ -598,6 +645,7 @@ void func_80C100DC(EnRecepgirl *this) {
}
}
```
and this is how we prefer it to be written.
With that, the last remaining function is `EnRecepgirl_Draw`. Draw functions have an extra layer of macroing that is required, so we shall cover them separately.


@ -1,10 +1,10 @@
# Types, structs, and padding
Reminders:
- In N64 MIPS, 1 word is 4 bytes (yes, the N64 is meant to be 64-bit, but MM and OoT mostly don't use it that way)
- A byte is 8 bits, or 2 hex digits
## Types
The following are the common data types used everywhere:
@ -14,7 +14,7 @@ The following are the common data types used everywhere:
| char | 1 byte | character |
| u8 | 1 byte | unsigned byte |
| s8 | 1 byte | signed byte |
| u16 | 2 bytes | unsigned short |
| s16 | 2 bytes | signed short |
| u32 | 4 bytes/1 word | unsigned int |
| s32 | 4 bytes/1 word | signed int |
@ -27,16 +27,18 @@ A pointer is sometimes mistaken for an `s32`. The last two, marked with `^`, are
`s32` is the default thing to use in the absence of any other information about the data.
Useful data for guessing types:
- `u8` is about 7 times more common than `s8`
- `s16` is about 16 times more common than `u16`
- `s32` is about 8 times more common than `u32`
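The sizes of the integer types can be checked at compile time. The sketch below uses `<stdint.h>` fixed-width types as stand-ins for the project's own typedefs (an assumption made so the block is self-contained), and `WORD_SIZE` and `BytesToWords` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs: assumed equivalents of the project's definitions. */
typedef uint8_t u8;
typedef int8_t s8;
typedef uint16_t u16;
typedef int16_t s16;
typedef uint32_t u32;
typedef int32_t s32;

#define WORD_SIZE 4 /* 1 word is 4 bytes in N64 MIPS */

_Static_assert(sizeof(u8) == 1 && sizeof(s8) == 1, "bytes are 1 byte");
_Static_assert(sizeof(u16) == 2 && sizeof(s16) == 2, "shorts are 2 bytes");
_Static_assert(sizeof(u32) == WORD_SIZE && sizeof(s32) == WORD_SIZE, "ints are 1 word");

/* Number of whole words needed to hold a given number of bytes. */
size_t BytesToWords(size_t bytes) {
    return (bytes + WORD_SIZE - 1) / WORD_SIZE;
}
```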
Another useful thing to put here: the typedef for an action function is
```C
typedef void (*ActorNameActionFunc)(struct ActorName*, PlayState*);
```
where you replace `ActorName` by the actual actor name as used elsewhere in the actor, e.g. `EnRecepgirl`. In MM these typedefs have been automatically generated, so you don't need to constantly copy from here or another actor any more.
## Some Common Structs
@ -45,8 +47,8 @@ Here are the usual names and the sizes of some of the most common structs used i
| ----------------------- | --------------------- | --------------- |
| `Actor` | `actor` | 0x144 |
| `DynaPolyActor` | `dyna` | 0x15C |
| `Vec3f` | | 0xC |
| `Vec3s` | | 0x6 |
| `SkelAnime` | `skelAnime` | 0x44 |
| `Vec3s[limbCount]` | `jointTable` | 0x6 * limbCount |
| `Vec3s[limbCount]` | `morphTable` | 0x6 * limbCount |
@ -59,7 +61,6 @@ Here are the usual names and the sizes of some of the most common structs used i
Note that `Actor` and `DynaPolyActor` have changed size from OoT.
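The vector sizes follow directly from their members, and the table sizes follow from the limb count. A minimal sketch, assuming `Vec3s` is three `s16`s and `Vec3f` is three 4-byte floats (the field names and the `JointTableSize` helper are illustrative, not copied from the headers):

```c
#include <stddef.h>
#include <stdint.h>

typedef int16_t s16;
typedef float f32;

/* Assumed layouts, consistent with the sizes listed in the table. */
typedef struct {
    s16 x, y, z;
} Vec3s;

typedef struct {
    f32 x, y, z;
} Vec3f;

/* Size in bytes of a jointTable or morphTable for a given limb count. */
size_t JointTableSize(size_t limbCount) {
    return sizeof(Vec3s) * limbCount;
}
```

This is handy when filling in an actor struct: a run of `0x6 * limbCount` unknown bytes right after a `SkelAnime` is a strong hint that it is a `jointTable` or `morphTable`.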
## Padding
### Alignment
@ -70,7 +71,7 @@ The clearest example of this is that variables with types that are 1 word in siz
### Struct padding
In actor structs, this manifests as some of the char arrays not being completely replaced by actual variables.
```C
typedef struct EnRecepgirl {
@ -123,7 +124,8 @@ Each section is 0x10/16-aligned (qword aligned), i.e. each new section begins at
#### Padding at the end of .text (function instructions)
In function instructions, this manifests as a set of `nop`s at the end of the last function: for example, in `EnRecepgirl`,
```mips
/* 0006B0 80C10680 27BD0038 */ addiu $sp, $sp, 0x38
/* 0006B4 80C10684 03E00008 */ jr $ra
/* 0006B8 80C10688 00000000 */ nop
@ -139,7 +141,8 @@ Once the rest of the functions match, this is automatic. So you never need to wo
In data, the last entry may contain up to 3 words of 0s as padding. These can safely be removed when migrating data, but make sure that you don't remove something that actually is accessed by the function and happens to be 0!
For example, in `ObjTree` we found that the last symbol in the data,
```mips
glabel D_80B9A5BC
/* 00006C 80B9A5BC */ .word 0x08000000
/* 000070 80B9A5C0 */ .word 0x00000000
@ -147,6 +150,7 @@ glabel D_80B9A5BC
/* 000078 80B9A5C8 */ .word 0x00000000
/* 00007C 80B9A5CC */ .word 0x00000000
```
had 2 words of padding: only the first 3 words are actually used in the `CollisionCheckInfoInit2`.
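The amount of padding follows from rounding the end of the real data up to the next multiple of 0x10. A minimal sketch of that calculation, with hypothetical helper names:

```c
#include <stddef.h>

/* Round an offset up to the next multiple of 0x10. */
size_t AlignUp16(size_t offset) {
    return (offset + 0xF) & ~(size_t)0xF;
}

/* Number of 4-byte padding words needed after data that ends at endOffset. */
size_t PaddingWords(size_t endOffset) {
    return (AlignUp16(endOffset) - endOffset) / 4;
}
```

In the `ObjTree` example, the three used words start at 0x6C and end at 0x78, so `PaddingWords(0x78)` gives 2, which accounts for the zero words at 0x78 and 0x7C.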
### Padding within the .data section


@ -1,6 +1,6 @@
# VSCode
A lot of people on this project use VSCode as their coding environment.
## Extensions
@ -13,10 +13,7 @@ There are a number of useful extensions available to make work more efficient:
- ~~bracket pair colorizer 2~~ (now obsolete due to VSCode's built-in bracket colouring)
- Better MIPS Support
## Useful stuff to know
- Ctrl + Alt + Up/Down (on Windows, on Linux it's Ctrl + Shift + Up/Down or Shift + Alt + Up/Down) gives multicursors across consecutive lines. If you want several cursors in a more diverse arrangement, middle clicking works, at least on Windows.
- Alt + Up/Down moves lines up/down.