Ordinarily, such a ROM swap would not be particularly difficult, but the SP12 uses 12-bit samples instead of the typical 8-bit or 16-bit. As a result, the samples are split across multiple ROMs and encoded in a way that makes replacement less than straightforward. Decoder and encoder utilities are therefore needed to convert to and from 16-bit wave files so the sounds can be edited correctly.
Note: It is possible to replace only the most-significant-byte (MS) ROMs, but this limits the bit depth to a mere 8 bits, and the substandard result is particularly noticeable on content like 808 bass drums, sine waves, and other sounds lacking in harmonic overtones. This is a lesser-quality ‘hack’ that does not take advantage of the full dynamic range of the SP12.
The encoder utility automates most or all of the tedious format and editing considerations. It is supplied with an individual wave file for each instrument sound; each file is automatically converted into the proper format, trimmed to the correct length, and placed at the correct offset without user intervention.
In practice, creating a custom SP12 kit requires little more than selecting samples of roughly the same duration and style as the original sounds being replaced. In other words, it’s best not to choose a 3-second ride cymbal as a replacement for the rimshot, as the ride is much too long and will be tied to the filtered outputs instead of the unfiltered outputs.
The decoder utility is mostly for double-checking the output of the encoder prior to EPROM burning. However, a fringe use case would be to create direct-from-ROM samples from any set of SP12 sound ROMs (factory or custom), for example to load into Kontakt or a hardware sampler.
Currently available kits: E-mu SP12, Roland TR-808, and Sequential Circuits TOM
Note: This will bore most people to death; only developers are likely to find it at all interesting.
The SP12 contains six 23256 type sound ROMs:
MS1 (IC09) – Electric Snare, Rimshot, Cowbell, Tom
MS2 (IC27) – Bass/Kick Drum, Electric Tom, Hi-Hat, Clap
MS3 (IC10) – Ride, Snare
MS4 (IC28) – Crash
LS12 (IC58) – corresponds to MS1 and MS2
LS34 (IC59) – corresponds to MS3 and MS4
Samples are unsigned, mono, 12-bit linear PCM at 27,500 Hz. The Zilog Z80 microprocessor is little-endian, so it would make sense for it to read the LSB ROM first and the MSB ROM second, yet the SP12 oddly seems to handle sound in big-endian order. This is purely academic: when manipulating the ROMs by other means, the byte order can be swapped at will.
The LS12 and LS34 ROMs contain two 4-bit nibbles per byte. Each nibble (high and low) corresponds to a different byte in the MS ROMs. The MSB (8 bits) is paired with an LSB nibble (4 bits), which gives the SP12 its 12-bit PCM sound data, so there is a 2:1 ratio between the MSB and LSB ROMs. That is to say, for every two bytes read from an MS ROM, only one byte is read from the matching LS ROM.
First, the MSB is read from the selected MS ROM and the low nibble is read from the corresponding LS ROM. Second, the next MSB is read from the MS ROM and the high nibble is read from the same byte in the LS ROM (again, a 2:1 ratio). The result is two 12-bit words assembled from three 8-bit bytes: two from the MS ROM and one from the LS ROM.
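The read order described above can be sketched in Python as follows. This is only an illustration, not the actual decoder utility: the function name `decode_12bit` is hypothetical, and it assumes the caller has already paired a run of MS bytes with the matching run of LS bytes (how LS12 is divided between MS1 and MS2 is not covered here).

```python
def decode_12bit(ms: bytes, ls: bytes) -> list[int]:
    """Combine MS bytes and LS nibbles into unsigned 12-bit words.

    Assumption: `ms` and `ls` are already aligned, so that LS byte k
    carries the low nibbles for MS bytes 2k and 2k+1.
    """
    if len(ms) != 2 * len(ls):
        raise ValueError("expected two MS bytes for every LS byte")

    words = []
    for i, msb in enumerate(ms):
        packed = ls[i // 2]
        # Even MS bytes take the low nibble, odd MS bytes the high nibble.
        nibble = (packed & 0x0F) if i % 2 == 0 else (packed >> 4)
        words.append((msb << 4) | nibble)
    return words
```

For example, MS bytes `AB CD` paired with the LS byte `21` yield the two words `AB1` and `CD2`.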
Since wave editors and digital audio file formats customarily use whole multiples of 8 bits, it is necessary to pad out the remaining 4 bits to create the target 16-bit PCM format. Additionally, 16-bit audio is (by convention) signed, yet the SP12 uses unsigned PCM, so this also needs to be taken into consideration.
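A minimal sketch of that conversion, in both directions, might look like the following. The function names are hypothetical, and the sketch assumes the 12 bits are left-justified in the 16-bit word (the four padding bits are the low ones and are zero) with the unsigned midpoint 2048 mapped to signed zero; the actual utilities may pad differently.

```python
def u12_to_s16(word: int) -> int:
    """Unsigned 12-bit (0..4095) -> signed 16-bit, padded in the low 4 bits."""
    if not 0 <= word <= 0xFFF:
        raise ValueError("not a 12-bit value")
    # Remove the unsigned offset, then shift into the top 12 bits.
    return (word - 2048) << 4


def s16_to_u12(sample: int) -> int:
    """Signed 16-bit (-32768..32767) -> unsigned 12-bit, dropping the low 4 bits."""
    if not -32768 <= sample <= 32767:
        raise ValueError("not a signed 16-bit value")
    # Add the unsigned offset back, then truncate to 12 bits.
    return (sample + 32768) >> 4
```

Going from 12 to 16 bits and back is lossless under these assumptions; going from an arbitrary 16-bit wave to 12 bits simply discards the four least significant bits.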
No metadata within the ROM may be overwritten or altered, and each sample must be exactly the same length as the original sound (as measured in terms of sample points) and must occupy the exact same offsets (position) within the full ROM waveform.
Note: This is not strictly true, but changes to the metadata require very special consideration (see ‘METADATA’ section below).
It’s also worth noting that some of the sounds are actually one sound transposed to generate variations (such as the toms) and are not unique in and of themselves, so they can only be replaced as a full group.
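The fixed-length, fixed-offset constraint can be illustrated with a short Python sketch. Everything here is hypothetical: the function names, the choice to pad short samples with the unsigned midpoint (2048) as silence, and any offsets or lengths passed in, since the real slot positions are not listed in this article.

```python
def fit_to_length(samples, length, silence=2048):
    """Truncate or pad a list of 12-bit words to exactly `length` points.

    Assumption: 2048 (the unsigned 12-bit midpoint) is used as silence.
    """
    fitted = list(samples[:length])
    fitted.extend([silence] * (length - len(fitted)))
    return fitted


def place_at_offset(rom_words, offset, samples):
    """Return a copy of `rom_words` with `samples` written at `offset`.

    The overall length never changes, so everything outside the slot
    (including any metadata) is left untouched.
    """
    if offset < 0 or offset + len(samples) > len(rom_words):
        raise ValueError("replacement does not fit inside the ROM image")
    patched = list(rom_words)
    patched[offset:offset + len(samples)] = samples
    return patched
```

A replacement sound would first be passed through `fit_to_length` with the original sound's length, then written with `place_at_offset` at the original sound's offset.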
After the desired alterations are made, the custom wave files are then run through an encoding utility to convert back to 12-bit and split the data into the respective ROM binaries. The generated binaries are then burned to blank 23256 EPROMs and substituted for the original sound ROMs.
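The splitting step can be sketched in Python as the mirror image of the read order described earlier: each pair of 12-bit words becomes two MS bytes and one packed LS byte. The function name `encode_12bit` is hypothetical, and this is a sketch of the packing only, not the actual encoder utility.

```python
def encode_12bit(words):
    """Split unsigned 12-bit words into (ms_bytes, ls_bytes).

    The first word of each pair supplies the low nibble of the LS byte,
    the second word supplies the high nibble.
    """
    if len(words) % 2 != 0:
        raise ValueError("need an even number of words to fill whole LS bytes")
    if any(not 0 <= w <= 0xFFF for w in words):
        raise ValueError("all words must fit in 12 bits")

    ms = bytearray()
    ls = bytearray()
    for i in range(0, len(words), 2):
        first, second = words[i], words[i + 1]
        ms.append(first >> 4)    # upper 8 bits of the first word
        ms.append(second >> 4)   # upper 8 bits of the second word
        ls.append(((second & 0x0F) << 4) | (first & 0x0F))
    return bytes(ms), bytes(ls)
```

For example, the words `AB1` and `CD2` produce the MS bytes `AB CD` and the single LS byte `21`.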
Note: This data is the result of reverse engineering and is not definitive.
Beyond not taking advantage of the entire SP12 bit-depth, the “8-bit only” technique also neglects the potential of metadata manipulation.
Pitch, decay, sample start, sample length, loop size, instrument name, and output channel can all be altered if done knowingly and properly.
Full details to be published shortly.
The SP12 sound ROMs have been independently verified to have the following 16-bit checksums: MS1 [E918], MS2 [E4F6], MS3 [91FD], MS4 , LS12 [D48C], LS34 [CCC7]
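The article does not say how these checksums were computed. Many EPROM programmers report the sum of all bytes modulo 65,536, so the Python sketch below assumes that method; if the values do not match a known-good dump, a different algorithm was probably used. The function name is hypothetical.

```python
def checksum16(data: bytes) -> int:
    """Sum of all bytes, truncated to 16 bits.

    Assumption: this is the plain additive checksum shown by many
    EPROM programmers, not a CRC.
    """
    return sum(data) & 0xFFFF


def format_checksum(data: bytes) -> str:
    """Format the checksum as four uppercase hex digits, e.g. 'E918'."""
    return f"{checksum16(data):04X}"
```

Reading a ROM dump in binary mode and passing its contents to `format_checksum` gives a string that can be compared directly against the list above.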