ScummVM
=======

This is a new port of ScummVM (https://www.scummvm.org), a program which allows
you to run certain classic graphical adventure and role-playing games, provided
you already have their data files.

You can find a full list with details on which games are supported and how well
on the compatibility page: https://www.scummvm.org/compatibility.


Yet another port?
-----------------

Yes, I am aware of the official Atari/FreeMiNT port done by KeithS over the
years (https://docs.scummvm.org/en/v2.6.1/other_platforms/atari.html). It is
even updated every release and put on the official ScummVM website. That port
is basically just a recompiled SDL backend for our platform - that certainly
has some advantages (works in GEM, can be easily compiled for the FireBee etc.)
but I have decided to take a different route:

- Reduced executable size: whatever is not essential or feasible on our
  platform is left out, which cuts the file size in half. See also the next
  point.

- Because there's limited horsepower available on our platform, features like
  hi-res 16bpp graphics, software synthesizers, scalers, real-time software
  MP3/OGG/FLAC playback etc., are omitted. This saves memory and disk space,
  making the whole port more lightweight.

- This port natively talks to the hardware, avoiding intermediate layers like
  SDL. Thus, it has more optimisations, fewer redraws, less data copying and
  is less crash-prone.

- Limiting the scope to 8bpp games opens the door to more thorough testing, and
  there is a certain interest in this in the community. 16bpp games could be
  played only in ARAnyM or similar, limiting the test audience a lot.

After I had seen how snappy NovaCoder's ScummVM on the Amiga is (he coded his
own backend), I decided to do the same and see whether I could do better.
And I could!


Main features
-------------

- Optimized for the Atari Falcon (ideally with the CT60/CT63/CT60e but for the
  less hungry games even a CT2/DFB@50 MHz or the AfterBurner040 could be
  enough).

- Full support for the SuperVidel, incl. the SuperBlitter (!)

- Removed features found too demanding for our platform; the most visible
  change is the exclusion of the 16bpp games (those are mostly hi-res anyway)
  but games in 640x480@8bpp work nicely.

- Direct rendering and single/double/triple buffering support.

- Custom (and optimal) drawing routines (especially for the cursor).

- Custom (Super)Videl resolutions for the best possible performance and visual
  experience (320x240 in RGB, chunky modes with SuperVidel, 640x480@16bpp for
  the overlay in RGB/SuperVidel, ...)

- Custom (hardware based) aspect ratio correction (!)

- Support for PC keys (page up, page down, pause, F11/F12, ...) and mouse wheel
  (Eiffel/ARAnyM only)

- Still without any assembly optimizations...

This makes games such as The Curse of Monkey Island more playable (on
SuperVidel nearly always also with CD (WAV) music and speech). Also, AdLib
emulation works nicely with many games without noticeable slowdowns.


Platform-specific features outside the GUI
------------------------------------------

Keyboard shortcut "CONTROL+u": immediate mute on/off toggle (also disables
sample mixing, contrary to what "Mute all" in the options does!)

Keyboard shortcut "CONTROL+ALT+a": immediate aspect ratio correction on/off
toggle.

"output_rate" in scummvm.ini: sample rate for mixing, can be 49170, 32780,
24585, 19668, 16390, 12292, 9834, 8195 (the lower the value, the faster the
mixing but also the worse the quality). Default is 24585 Hz (16-bit, stereo).

"output_samples" in scummvm.ini: number of samples to preload. Default is 2048
which equals about 83 ms of audio lag and seems to be about right for most
games on my CT60@66 MHz.

If you want to play with those two values, the rule of thumb is: (lag in ms) =
(output_samples / output_rate) * 1000. But it's totally OK just to double the
samples value to get rid of stuttering in a heavier game.
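The rule of thumb above can be sketched in a few lines of C++. This is only an
illustration, not code from the port: the names kSupportedRates,
isSupportedRate() and audioLagMs() are hypothetical, the rate list is copied
from this readme.

```cpp
#include <cstdint>

// "output_rate" values listed in this readme (hypothetical helper table).
static const uint32_t kSupportedRates[] = {
    49170, 32780, 24585, 19668, 16390, 12292, 9834, 8195
};

// Returns true if 'rate' is one of the values listed above.
inline bool isSupportedRate(uint32_t rate) {
    for (uint32_t r : kSupportedRates) {
        if (r == rate)
            return true;
    }
    return false;
}

// (lag in ms) = (output_samples / output_rate) * 1000
inline double audioLagMs(uint32_t outputSamples, uint32_t outputRate) {
    return static_cast<double>(outputSamples) / outputRate * 1000.0;
}
```

With the defaults (2048 samples at 24585 Hz) this gives roughly 83 ms; doubling
the samples value to 4096 roughly doubles the lag to about 167 ms.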


Graphics modes
--------------

This topic is more complex than it looks. ScummVM renders game graphics using
rectangles and this port offers the following options to render them:

Direct rendering (vsync on/off) - present only with the SuperVidel
Single buffering (vsync on/off)
Double buffering (vsync always on, the checkbox is ignored)
Triple buffering (vsync always off, the checkbox just selects a different kind)

Direct rendering:
~~~~~~~~~~~~~~~~~

This is direct writing of the pixels into (SuperVidel's) screen buffer. Since
the updates are supplied as rectangles and not as the whole screen, there's no
way to implement direct writing *and* double/triple buffering. Vsync() only
synchronizes the point when the rendering process begins - if it takes more
than the time reserved for the vertical blank interrupt (which happens
with most games), you'll see screen tearing.

Pros:

- fastest possible rendering (especially in 640x480 with a lot of small
  rectangle updates where the buffer copying drags performance down)

Cons:

- screen tearing in most cases

- SuperVidel only: using C2P would be not only suboptimal (every rectangle
  would be C2P'ed separately instead of copying them and C2P'ing only the final
  screen) but would also pose an additional problem, as C2P requires data
  aligned on a 16px boundary and ScummVM supplies arbitrarily-sized rectangles
  (this is solvable by custom Surface allocation but it's not bullet-proof).
  In theory I could implement direct rendering for the Falcon hicolor
  (320x240@16bpp) but this creates another set of issues, e.g. when the
  palette is updated but not the whole screen, some rectangles would be
  rendered in the old palette and some in the new one.

SuperBlitter used: sometimes (when ScummVM allocates a surface via its create()
function; custom/small buffers originating in the engine code are still copied
using the CPU).

Single buffering:
~~~~~~~~~~~~~~~~~

This is very similar to the previous mode with the difference that the engine
uses an intermediate buffer for storing the rectangles but still remembers
which ones they were. It also works on plain Videl and applies the chunky to
planar process to each of the rectangles separately, avoiding fullscreen
updates (but if one is needed, there is an optimized code path for it). Vsync()
is used the same way as in the previous mode, i.e. screen tearing is still
possible.

Pros:

- second fastest possible rendering

- doesn't update the whole screen (works best with a moderate amount of
  rectangles to update)

Cons:

- screen tearing in most cases

- if there are too many smaller rectangles, it can be less efficient than
  updating the whole buffer at once

SuperBlitter used: yes, for rectangle blitting to screen and cursor restoration.
Sometimes also for generic copying between buffers (see above).
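To illustrate the 16px alignment that C2P requires (see the direct rendering
section), here is a minimal sketch of widening an arbitrary rectangle so that
both of its horizontal edges land on a 16-pixel boundary. Whether the port does
exactly this is an assumption; DirtyRect and alignTo16() are hypothetical names
and not ScummVM's Common::Rect API. Coordinates are assumed to be non-negative
and the screen width a multiple of 16.

```cpp
// Hypothetical rectangle type; right and bottom are exclusive.
struct DirtyRect {
    int left, top, right, bottom;
};

// Widens 'r' horizontally to 16-pixel boundaries and clamps it to the screen.
// Assumes non-negative coordinates and screenWidth being a multiple of 16.
inline DirtyRect alignTo16(DirtyRect r, int screenWidth) {
    r.left = r.left & ~15;          // round down to a multiple of 16
    r.right = (r.right + 15) & ~15; // round up to a multiple of 16
    if (r.right > screenWidth)
        r.right = screenWidth;
    return r;
}
```

For example, a rectangle spanning x = 5..21 grows to x = 0..32, which also
shows the cost: many small rectangles can end up converting far more pixels
than have actually changed.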

Double buffering:
~~~~~~~~~~~~~~~~~

The most common rendering mode. It extends the idea of single buffering - it
renders into two buffers, one is visible while the other one is used for
updating. At the end of the update process the two buffers are swapped, so the
newly updated one is displayed. By definition, Vsync() must always be enabled
(the buffers are swapped in the vertical blank handler); otherwise you'd see
screen tearing.

Pros:

- stable frame rate, e.g. a fixed 30 FPS for the whole time if the game takes,
  say, 1.7 - 1.9 frames per update

- no screen tearing in any situation

Cons:

- since two buffers are present, the buffer is always blitted into the screen
  surface as a whole, even if only one tiny little rectangle has changed
  (excluding the cursor)

- frame rate is set to 60/30/20/15/etc FPS so you can see big irregular jumps
  between 30 and 15 FPS for example; this happens when screen updates take a
  variable amount of time: since Vsync() is always called, the rendering
  pipeline has to wait until the next frame even if only 1% of the frame time
  has been used.

SuperBlitter used: yes, for rectangle blitting to screen and cursor restoration.
Sometimes also for generic copying between buffers (see above).
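The frame rate steps described for double buffering follow from simple
arithmetic: every update is rounded up to a whole number of display frames.
A minimal sketch, assuming a 60 Hz display in the examples (vsyncedFps() is a
hypothetical helper, not part of the port):

```cpp
#include <cmath>

// Effective frame rate when every update has to wait for the next vertical
// blank. 'framesPerUpdate' is the time one update takes, measured in display
// frames, and must be greater than zero.
inline double vsyncedFps(double refreshHz, double framesPerUpdate) {
    return refreshHz / std::ceil(framesPerUpdate);
}
```

On a 60 Hz display, updates taking 1.7 or 1.9 frames both end up at 30 FPS,
while an update taking just over 2 frames already drops to 20 FPS.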

Triple buffering:
~~~~~~~~~~~~~~~~~

Best of both worlds - screen tearing is avoided thanks to the use of multiple
buffers and the rendering pipeline doesn't have to wait until Vsync(). The vsync
flag is used only to differentiate between two (very similar) modes of
operation:

1. "True triple buffering" as described in
https://en.wikipedia.org/wiki/Multiple_buffering#Triple_buffering (vsync on)

2. "Swap chain" as described in https://en.wikipedia.org/wiki/Swap_chain (vsync
off)

Pros:

- best compromise between performance and visual experience

- works well with both higher and lower frame rates

Cons:

- since three buffers are present, the buffer is always blitted into the screen
  surface as a whole, even if only one tiny little rectangle has changed
  (excluding the cursor)

- slightly irregular frame rate (depends solely on the game's complexity)

- in case of extremely fast rendering in 1.), one or more buffers are
  dropped in favor of showing only the most recent one (unlikely)

- in case of extremely fast rendering in 2.), screen tearing is possible
  because the rendering pipeline starts overwriting the buffer which is
  currently displayed (unlikely)

SuperBlitter used: yes, for rectangle blitting to screen and cursor restoration.
Sometimes also for generic copying between buffers (see above).

Triple buffering with vsync on is the default mode for this port.
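
The bookkeeping behind "true triple buffering" (mode 1 above) can be sketched
as follows. This is a simplified illustration under my own assumptions, not the
port's actual implementation; the TripleBuffer type and its members are
hypothetical and the buffers are represented only by their indices.

```cpp
#include <utility>

// Hypothetical bookkeeping for three screen buffers, identified by index 0..2.
struct TripleBuffer {
    int visible = 0;           // currently shown by the video hardware
    int pending = 1;           // finished frame waiting for the vertical blank
    int working = 2;           // buffer the renderer draws into
    bool pendingReady = false; // true if 'pending' holds an unshown frame

    // Called by the renderer when a frame is complete; it never has to wait.
    // If the previous pending frame was not shown yet, it is simply dropped.
    void finishFrame() {
        std::swap(working, pending);
        pendingReady = true;
    }

    // Called from the vertical blank handler.
    // Returns true if a new buffer became visible.
    bool onVerticalBlank() {
        if (!pendingReady)
            return false;
        std::swap(visible, pending);
        pendingReady = false;
        return true;
    }
};
```

Because the renderer only ever touches 'working' and 'pending', it never
overwrites the visible buffer, which is why tearing cannot occur in this mode;
the price is that a frame finished too quickly may never be displayed.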


SuperVidel and SuperBlitter
---------------------------

As mentioned, this port uses SuperVidel and its SuperBlitter heavily. That
means that if the SuperVidel is detected, it does the following:

- patches all 8bpp VGA resolutions to chunky ones, rendering all C2P routines
  useless

- patches all surface addresses by OR'ing them with 0xA0000000, i.e. using SV
  RAM instead of slow ST RAM (and even instead of TT RAM, to allow pure
  SuperBlitter copying)

- when SuperVidel FW version >= 9 is detected, the async FIFO buffer is used
  instead of the slower sync blitting (where one has to wait for every
  rectangle blit to finish). This applies only to chunky buffer -> screen
  surface copies (as the generic surface copy can't rely on this behavior) but
  despite this limitation it sometimes leads to nearly zero-cost rendering
  and makes a *huge* difference for 640x480 fullscreen updates.
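
The address patching mentioned in the list above boils down to a single bitwise
operation. A minimal sketch, where only the 0xA0000000 constant comes from this
readme and the names kSvRamBase and toSvRamAddress() are hypothetical:

```cpp
#include <cstdint>

// Value OR'ed into surface addresses to reach SV RAM (taken from this readme).
static const uint32_t kSvRamBase = 0xA0000000u;

// Maps a surface address to its SV RAM counterpart.
inline uint32_t toSvRamAddress(uint32_t address) {
    return address | kSvRamBase;
}
```

Since it is a plain OR, applying it to an already patched address changes
nothing.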


Performance considerations/pitfalls
-----------------------------------

It's important to understand what affects performance on our limited platform
to avoid unpleasant playing experiences.

Game engines with unexpected performance hit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A typical example from this category is the Gobliins engine (and its
sequels). At first it looks like our machine / backend is doing something
terribly wrong but the truth is it is the engine itself which is doing a lot of
unnecessary redraws and updates, sometimes even before reaching the backend.
The only real solution is to profile and fix the engine.

Too many fullscreen updates
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Somewhat related to the previous point - sometimes the engine authors didn't
realize the impact of every update on the overall performance and instead of
updating only the rectangles that really had changed, they ask for a full screen
update. Not a problem on a >1 GHz machine but very visible on Atari! Also, this
is (by definition) the case with animated intros, especially those in 640x480.

MIDI vs. AdLib vs. sampled music
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It might seem that sampled music replay must be the most demanding one but it
is quite the contrary! _Always_ prefer a CD version of a game (with *.wav
tracks) over any other version. With one exception: if you have a native MIDI
device able to replay the given game's MIDI notes (using the STMIDI plugin).
MIDI emulation (synthesis) can easily cost as many as 10 FPS.

CD music slows everything down
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some games use separate audio *and* video streams (files). Even if the CPU is
able to handle both, the bottleneck becomes ... disk access. This is visible in
The Curse Of Monkey Island for example -- there's audible stuttering during the
intro sequence (and during the game as well). Increasing "output_samples" makes
the rendering literally crawl! Why? Because disk I/O is busy with loading
even *more* sample data so there's less time for video loading and rendering.
Try to put "musdisk1.bun" and "musdisk2.bun" into a ramdisk (e.g. u:/ram in
FreeMiNT); you'll be pleasantly surprised by the performance boost gained.

"Mute" vs. "Mute all" in GUI vs. "No music" in GUI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Not the same thing. "Mute" (available only via the shortcut CONTROL+u) generates
an event to which the sample mixer can react (i.e. stop mixing the silence...).

"Mute all" doesn't generate anything, it basically just lowers the volume of the
music to zero.

"No music" means using the null audio plugin which prevents generating any MIDI
music (and therefore avoids the expensive synthesis emulation) but beware, it
doesn't affect CD (*.wav) playback at all!

So for the best performance, always choose "No music" in the GUI options when
the game contains MIDI tracks and "Mute" when the game contains a sampled
soundtrack.

Please note that it is not that bad: you can surely play The Secret of Monkey
Island with AdLib enabled (but the CD/talkie versions sound better and
are cheaper to play ;)).

Vsync in GUI
~~~~~~~~~~~~

Be careful with the vsync option. It can easily cripple direct/single buffer
rendering by 10-15 FPS if not used with caution. That happens if a game takes,
say, 1.2 frames per update (so causing screen tearing anyway and rendering the
option useless) but Vsync() forces it to wait 2 full frames instead.

By the way, the vsync flag in Global Options also affects the overlay rendering
(with all the pitfalls which apply to the single buffering mode).

Slow GUI
~~~~~~~~

Theme handling is quite slow - each theme must be depacked, each one contains
quite a few XML files to parse and quite a few images to load/convert. That's
the reason why the built-in one is used as default; it dramatically speeds up
loading time. A compromise solution is to depack the theme into an equally
named directory (i.e. avoiding the depacking phase) but you need a filesystem
with long name support for that to work.


Known issues
------------

- aspect ratio correction works on RGB only (so far)

- SuperVidel's DVI output is stretched when in 320x200 or 640x400; I'll wait
  for other people's experiences, maybe only my LCD is so lame.

- adding a game in TOS and loading it in FreeMiNT (and vice versa) generates
  incompatible paths. Either use only one system or edit scummvm.ini and use
  only relative paths there (mintlib bug/limitation).

- the talkie version of MI1 needs to be merged from two sources: first generate
  the DOS version and then additionally also the flac version. Then convert all
  *.flac files into *.wav and replace monkey.sof (flac) with monster.sou (DOS).
  And of course, don't forget to set the extra path in Game options to the
  folder where *.wav files are located! For MI2 just use the DOS version,
  there are no CD tracks available. :(


Future plans
------------

- aspect ratio correction for VGA/SuperVidel

- unified file paths in scummvm.ini

- 8bpp overlay (and get rid of all that 16bpp handling code)

- profiling :) (see also https://github.com/scummvm/scummvm/pull/2382)

- DSP-based sample mixer

- avoid loading music/speech files (and thus slowing down everything) if muted

- assembly copy routines for screen/chunky surfaces (even with SuperVidel
  present it is not possible to use the SuperBlitter for every surface)

- cached audio/video streams (i.e. don't load only "output_samples" number of
  samples but cache, say, 1 second so disk I/O won't be so stressed)

- using LDG or Thorsten Otto's sharedlibs: https://tho-otto.de/sharedlibs.php
  for game engine plugins to relieve the huge binary size

- reuse modified rects in double/triple buffer in the next frame - that way we
  wouldn't need to refresh the whole screen in every case

- add support for the TT030; this would be easily possible once I rewrite the
  renderer with more flexible resolution switching

- ignore (queue) updateScreen() calls to avoid aggressive drawing / buffer
  switching from some engines; update every X ms instead

- don't hardcode some of the buffers for caching purposes, determine the size
  based on the amount of free RAM

- true audio CD support via MetaDOS API

- OPL2LPT and Retrowave support (if I manage to purchase it somewhere)

Closing words
-------------

I have opened a pull request with all of my code
(https://github.com/scummvm/scummvm/pull/4687) so who knows, maybe ScummVM
2.8.0 for Atari will already be present on the official website. :-)


MiKRO / Mystic Bytes, XX.XX.2023
Kosice / Slovakia
miro.kropacek@gmail.com
http://mikro.atari.org