
October 7, 2007

sumi-chan

tonight saw the successful launch of my virtual multithreading solution, sumi-chan.

first of all, some background: right now, i own three modern computers, all of which are powered by dual core processors. the brilliant among us will have already concluded that this means i have a total of six (6) processor cores available. all the computers share storage over a gigabit lan.

unfortunately, the high quality deinterlacing filter mvbob() is not multithreaded, so it can only use one core at a time. in the past i have worked around this by setting up multiple instances of vdub, giving e.g. one fourth of the total segments in a run to each, and starting them all at the same time - but of course not all the segments are identical in length, so there was always some inefficiency as some vdubs finished their deinterlacing before others.

avisynth is a very useful tool, and it was only a matter of time before i used it to automate a process by which any given input could be split up across a given number of threads, then automatically recombined later. i call this process sumi-chan for lack of a more creative descriptor (if the name were to describe the software properly, it would be something like "avisynth filter virtual multithreading system" - so long! so boring!).

usage is simple - i just run sumichan.sh with the avisynth scripts' basename and the desired number of cpu cores across which to split up the work, then run the resulting batch files on each real machine (one batch file for each cpu core). thus, entering 6 for sumi-chan's second argument results in six sets of avisynth scripts and six batch files.

due to integer rounding, if the number of frames in the source material is not evenly divisible by the number of cpu cores specified, the few leftover frames will be "squeezed" into the final work unit. this is unlikely to result in any significant inefficiency, as mvbob() usually processes between 1 and 2 frames per second even in extremely complex scenes. changes in said complexity throughout each work unit will result in much more significant inefficiency (compared to a true multithreading solution whereby all cpu cores would be simultaneously busy or not busy).
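for the curious, the splitting itself boils down to something like the following. this is a simplified sketch rather than the real sumichan.sh - one source script instead of a set, a hard-coded frame count and placeholder filenames - and the real script also writes the batch files that feed each piece to vdub:

#!/bin/bash
# simplified sumi-chan-style splitter (not the real sumichan.sh)
# usage: ./split.sh mysource 6    <- script basename and number of cpu cores

base=$1
cores=$2
frames=107892                  # placeholder - the real script reads this from the source

per=$((frames / cores))        # integer division; the leftovers go to the last work unit

for i in $(seq 1 "$cores"); do
  first=$(( (i - 1) * per ))
  last=$(( i * per - 1 ))
  [ "$i" -eq "$cores" ] && last=$((frames - 1))   # final work unit gets "squeezed"

  cat > "${base}_part${i}.avs" <<EOF
import("${base}.avs")    # the original script, mvbob() and all
trim(${first}, ${last})  # this core's share of the frames
EOF
done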

sumi-chan will appear as an optional feature in anri-chan 3 (we're getting close to the release of 2 right now). it ought to make encoding d1 material much more realistic for most people.

incidentally, i'm using quantizer 1 vfw xvid (via vdub) for the nmf. i've been extremely pleased with both the cpu and disk costs associated with this codec (they are much smaller than those associated with lagarith - and it crashes less to boot!).

of course, i have to include the obligatory crap quality picture of my new virtual computer, dubbed "v6". cores 1 and 2 are in the macbook on the left, while cores 3, 4, 5 and 6 are monitored on the mac mini's screen, but are in reality split equally between v5 (my pc) and the mini. (the terminal windows lower down on the mini's screen are left over from working on the prime hunters run as discussed in the previous entry.)

Posted by njahnke at 2:56 AM | Comments (0)

October 5, 2007

diy ds

it's been almost eighteen months since i posted the ds recording guidelines, and now the first ds run has been submitted to sda.

DSGamer3002 followed those guidelines to the letter for his prime hunters run, and this was reflected in the quality of the recording.

however, the original recordings needed to be cropped in order to get rid of their useless black borders. the catch was that the cropping parameters needed to be dynamic - that is, where and how much i needed to crop changed every few seconds.

as i'm sure you've figured out by now, i came up with a way to accomplish this in a semi-automated fashion, and the rest of this blog entry will be devoted to this methodology.

the first step was to locate a simple autocropping plug-in for avisynth. this came in the form of the aptly-named autocrop, found at warpenterprises. thankfully, autocrop() offers a great deal of customization of the autocropping algorithm. the user can set not only the crop/don't crop threshold, but the number of sample frames the plug-in takes into consideration within the video, as well.

the crop threshold is probably 8-bit luma, while the sample frames are, according to the plug-in's author, taken equidistantly from within the source video. however, no matter how many samples the plug-in uses, there will always be only one set of cropping settings per video. thus, whenever dsgamer's tired hands moved ever so slightly, the screen would no longer be contained entirely within the initially-calculated cropping values.

the obvious solution was to make many more, smaller videos out of dsgamer's run - enter acrop.

this bash script takes two arguments: the basename of the video and audio originals and the number of frames in the original video. the script then creates a number of avisynth scripts, each of them including a number of source declarations and autocrop() invocations for bits of the original video.

i call each section of video on which autocrop() operates a "window". the window size is not arbitrary, but is determined by how long a sequence of absolute black will be on screen in the source video. if any window consists entirely of black, then autocrop() will naïvely produce 0x0 video, which is a contradiction in terms - avisynth has no ability, from what i have seen, to resize a 0x0 frame. thus, even though it's all black anyway, windows consisting entirely of black must be avoided. the window size, then, should be perhaps one or two seconds longer than the longest stretch of absolute black in the source video.

i found taking the correct number of samples, too, to be crucial: the process will fail (producing 0x0 video) if either too many or too few are taken.

for convenience and efficiency of processing, i included several windows in each new script file. for this speed run i created scripts 1000 frames long, with five 200-frame windows in each. because the source video was exactly 30 fps, a 10-minute segment would correspond to around 20 individual script files.
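stripped down, the generator looks something like this. note this is only a sketch, not the actual acrop script - the autocrop() argument names are from memory (check the plug-in's documentation), and the resize target is just a placeholder so the spliced windows share a common frame size:

#!/bin/bash
# simplified acrop-style generator (not the real acrop)
# usage: ./acrop_sketch.sh ph_seg01 18000    <- basename and frame count (example values)

base=$1       # basename of the video/audio originals
frames=$2     # number of frames in the original video
window=200    # frames per window - longer than the longest all-black stretch
per_script=5  # windows per script file

script=0
f=0
while [ "$f" -lt "$frames" ]; do
  script=$((script + 1))
  out="${base}_acrop_${script}.avs"
  splice=""
  : > "$out"
  for w in $(seq 1 "$per_script"); do
    [ "$f" -ge "$frames" ] && break
    last=$((f + window - 1))
    [ "$last" -ge "$frames" ] && last=$((frames - 1))
    # one source declaration and one autocrop() per window. the argument names are
    # assumptions; the samples count matters - too many or too few gives 0x0 video.
    echo "w${w} = directshowsource(\"${base}.avi\").trim(${f}, ${last}).autocrop(threshold=100, samples=5).lanczosresize(640, 480)" >> "$out"
    splice="${splice:+$splice ++ }w${w}"
    f=$((last + 1))
  done
  echo "$splice" >> "$out"    # splice this script's windows back together
done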

it was not possible to include every script file in one master script for two reasons. one was the extremely long startup time associated with stacking multiple instances of autocrop() on top of one another. on my core 2 duo 2 ghz, it took between 10 and 20 seconds just to open one script with 5 windows in it. the other was that each window represents one directshowsource() declaration for the source video - in other words, there was one copy of the entire source video in memory for each window in the currently open script(s). avisynth is extremely ram-inefficient in this way, but you can't blame it, actually, because we're talking about windoze here.

of course, my default autocrop() threshold of 100 was not always optimal. in fact, i had to preview every window in vdub to make sure it was not too high or too low. many times i simply replaced every 100 with e.g. 70 because the game environment in that particular segment was darker, and autocrop() was getting fooled and cropping off parts of the game screen. it was also occasionally necessary to increase the threshold - e.g. when the screen went totally white, light would reflect off surrounding objects and back onto the area around the ds lite's screens, causing autocrop() to leave in the plastic border around said screens.
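that kind of bulk edit is a one-liner, by the way (gnu sed, with an example filename pattern):

# drop the autocrop() threshold from 100 to 70 in every script of a darker segment
sed -i 's/100/70/g' darksegment_acrop_*.avs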

after i determined the correct autocrop() parameters for each window of a segment, i aligned the audio (recorded separately in audacity by dsgamer via a 1/8" miniplug cable) by placing the over-the-air audio from the original source video on the left channel and the correct, line-in recorded audio on the right channel. i knew the "new" audio was lined up when i could no longer tell whether the sound was arriving first in the left or the right channel. i adjusted the correct audio's delay using delayaudio() in a master avisynth script file which loaded all of the individual pieces of the segment in question (by then saved out from the original scripts as lagarith-compressed new master files).
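the check script itself was tiny - something like the following, simplified here to a single piece with placeholder filenames and a placeholder delay value:

# rough sketch of the alignment-check script (placeholder names, not the production script)
cat > align_check.avs <<'EOF'
piece  = avisource("ph_seg01_piece01.avi")   # lagarith new master with over-the-air audio
linein = wavsource("ph_seg01_linein.wav")    # dsgamer's line-in recording from audacity
linein = delayaudio(linein, 0.135)           # tweak until neither ear hears the sound first
# over-the-air audio on the left channel, line-in on the right
audiodub(piece, monotostereo(converttomono(piece), converttomono(linein)))
EOF

rinse and repeat with new delay values until the sound snaps to the center.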

i would like to emphasize my success with this relatively novel method of audio realignment. in sharp contrast to the usual approach to audio realignment, i did not even have to look at the video to know the audio was aligned - i literally did the work with my eyes closed.

and so the world's first high quality downloadable nintendo ds speed run was prepared for distribution on sda. i hope that everyone enjoys dsgamer's hard work and my automation-laced lack thereof when his run is posted on sda soon.

i almost forgot - here's the obligatory screenshot of my initial autocrop() parameters discovery process.

Posted by njahnke at 4:12 PM | Comments (0)