Implement several performance improvements:
1. Don't run addr2line for the whole binary.
Frequently only a small part of vmlinux is covered, and
running addr2line over the whole binary ahead of time takes an insane amount of time.
Instead, run addr2line incrementally, only for symbols that have any coverage.
2. Run addr2line in parallel.
3. Instead of running objdump -d on the whole object file to find
coverage points, look for call instructions in the .text section directly.
Currently this is implemented only for amd64.
Also this Go change cuts another 7 seconds:
f92c64045f
(faster iteration over DWARF compile units, should speed up syz-check as well).
Update #2006