SPELLING FAIL
Ok, so I've been spelling SwisTrack wrong all this time. I guess that's why I could never find it when I googled SwissTrack D:
It sucks, but it is going to have to be the case, for now at least. It has taken way too long to get anything useful to actually work on it; only about 3/4 of the OpenCV functions run without crashing whatever app I compile. This was also the case with SwissTrack. Even after getting around the whole ffmpeg swscale/img_convert issue and rolling my own opencv-cell libs, SwissTrack still wouldn't grab frames from the USB cam, not to mention that it wanted to use one of those OpenCV functions that crash on the SPU.
It's now the day before leaving for ShmooCon, so I think it's time to move on and get something that works. I'm going to be running SwissTrack on Windows on my laptop, since I know for a fact it will work. In fact, using the built-in camera, I have already made the tracking pipeline. Now all that is left is to write the MJPEG input component for SwissTrack. LUCKILY THAT IS PROBABLY GOING TO BE AROUND 5 LINES OF CODE ADDED TO A COPY OF THE AVI INPUT COMPONENT!
At lunch today, Steve and I talked a lot about the future of this project, and why said future was becoming so bleak. We realized we needed to come up with some standards for the rest of the crew to base their work on, at least for the programming part. Thus I took it upon myself to start a Sensor Comm API, which will basically consist of a data type and a few methods that will make the jobs of the sensor crew much clearer. The idea is to have a structure very similar to a machine code instruction, well, only in that it is a fixed number of bytes, divided up into groups of bits for different parts. Right now, I have a 16-bit word consisting of 3 bits that define the type of the sensor, 3 bits that define the ID (could probably be reduced, but we have a lot of extra space if we use 16 bits), and 10 bits for data returned by the sensorkiddy. Now all the sensorkiddies have to do is write their own self-contained function (which we will most likely call in a separate arbitration section) that calls the sendWord() method when it has something to say. sendWord() takes a SensorWord and sends its two bytes down the serial line, thus clarifying jazzman's job as well.
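A minimal sketch of that 16-bit word in C, under some loud assumptions: the post does not say which bits go where or which byte is sent first, so the layout here (type in the top 3 bits, ID in the next 3, data in the low 10) and the high-byte-first order are guesses. makeWord(), the wordType()/wordId()/wordData() accessors, sendWordVia(), PutByteFn, and ByteLog are all hypothetical names; the real sendWord() would presumably write straight to the serial port instead of taking a callback.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t SensorWord;

/* Assumed layout (not fixed by the post):
 * bits 15..13 = sensor type, bits 12..10 = sensor ID, bits 9..0 = data. */
SensorWord makeWord(unsigned type, unsigned id, unsigned data)
{
    return (SensorWord)(((type & 0x7u) << 13) |
                        ((id & 0x7u) << 10) |
                        (data & 0x3FFu));
}

unsigned wordType(SensorWord w) { return (w >> 13) & 0x7u; }
unsigned wordId(SensorWord w)   { return (w >> 10) & 0x7u; }
unsigned wordData(SensorWord w) { return w & 0x3FFu; }

/* Hypothetical stand-in for "write one byte to the serial line". */
typedef void (*PutByteFn)(uint8_t byte, void *ctx);

/* Sends the two bytes of a word, high byte first (an assumption). */
void sendWordVia(SensorWord w, PutByteFn put, void *ctx)
{
    put((uint8_t)(w >> 8), ctx);
    put((uint8_t)(w & 0xFFu), ctx);
}

/* Test double: records bytes in memory instead of touching hardware. */
typedef struct {
    uint8_t bytes[8];
    size_t count;
} ByteLog;

void logByte(uint8_t byte, void *ctx)
{
    ByteLog *log = (ByteLog *)ctx;
    if (log->count < sizeof log->bytes) {
        log->bytes[log->count++] = byte;
    }
}
```

With this layout, a type-5 sensor with ID 2 reporting the value 0x155 packs to 0xA955 and goes out as the bytes 0xA9, 0x55. Masking every field on the way in means an out-of-range value can't spill into a neighbouring field.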
Now, for all you sensorkiddies, what you need to do is figure out how you want to use your 10 bits of data. You probably won't even need the 10 bits, but it doesn't matter. All you have to do is program the interpretation of the sensor data in C, reduce your interpretation down to 10 bits, put it into your buffer, and call sendWord() on it! Behold, the beauty of abstraction.
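As one possible way to do the "reduce down to 10 bits" step, here is a small C sketch that linearly rescales a raw reading into the range 0..1023. The function name reduceTo10Bits() and the idea of a known maximum raw value are assumptions for illustration; a sensor whose interpretation is not linear would need its own mapping.

```c
#include <stdint.h>

/* Linearly rescales raw (expected range 0..raw_max) to 0..1023,
 * rounding to nearest. Readings above raw_max are clamped.
 * Hypothetical helper, not part of the Sensor Comm API itself. */
uint16_t reduceTo10Bits(uint32_t raw, uint32_t raw_max)
{
    if (raw_max == 0) {
        return 0; /* nothing sensible to scale against */
    }
    if (raw > raw_max) {
        raw = raw_max;
    }
    /* 64-bit intermediate so raw * 1023 cannot overflow. */
    return (uint16_t)(((uint64_t)raw * 1023u + raw_max / 2u) / raw_max);
}
```

For example, a 12-bit ADC reading (raw_max of 4095) at mid-scale comes out around 512, and the result is what would go into the data field of the word handed to sendWord().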
There is a more technical wiki post on all this here.
I was looking around last Thursday and came across this page, which detailed the use of the wonderful Cell processor to greatly increase the speed of some of the image recognition operations. I spent last weekend getting Gentoo up and running, but found the Cell toolchain too hard to deal with and unstable, so I decided to go with Fedora. The CVCell packages were already compiled and distributed in RPM format, so Fedora seemed like the obvious choice for getting this system up and running for development. At this point, I'm looking at the examples provided and finding as many OpenCV tutorials as I can. That's all for now.
We had originally planned to use the BOCA cluster (big old computer array, old being the key word), but after looking deeper into OpenCV, it is apparently pretty taxing on the processor. Solution? Use the power of the Cell processor to greatly increase our computing power (for OpenCV at least). Hmmm… where would we find a machine with such a specialized processor sold to common students though? Answer: Sony PlayStation 3. For preliminary testing, at least, we will be using my brother's PS3 running Gentoo (with a 64-bit userland of course). We will also be using a specially formulated version of OpenCV meant for the Cell BE, aptly titled CVCell. Boasting many improved functions designed specifically for the Cell and its 6 friendly little SPEs, we are able to gain an order of magnitude more processing power over a top-of-the-line x86 chip for our application (object recognition/tracking, gesture interpretation, etc.), greatly reducing the cost of the computing backend.
So far, the ideal way of handling all the video feeds is to only handle them when there is something in front of the camera besides background. Ideally, we will use input from some of the other sensors to let the router signal the PS3 to start grabbing video only when there is something to see. Since the sensors will have some error, we will most likely have some kind of fuzzy logic set up so we don't waste bandwidth/processing power on tracking some harmless furry creature.
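To make the fuzzy-logic idea a bit more concrete, here is a rough C sketch of such a gate. Everything specific in it is made up for illustration: the two inputs (a motion reading and a size estimate, both as 10-bit values), the ramp thresholds, and the 0.5 cut-off. It uses the standard min operator for fuzzy AND.

```c
/* Ramp membership function: 0 at or below lo, 1 at or above hi,
 * linear in between. */
double ramp(double x, double lo, double hi)
{
    if (x <= lo) return 0.0;
    if (x >= hi) return 1.0;
    return (x - lo) / (hi - lo);
}

/* Fuzzy AND using the min operator. */
double fuzzyAnd(double a, double b) { return a < b ? a : b; }

/* Decides whether to start grabbing video.
 * motion_raw and size_raw are hypothetical 10-bit sensor readings;
 * the thresholds below are placeholders, not measured values. */
int shouldGrabVideo(unsigned motion_raw, unsigned size_raw)
{
    double moving = ramp((double)motion_raw, 100.0, 400.0);
    double big    = ramp((double)size_raw, 300.0, 700.0);
    return fuzzyAnd(moving, big) >= 0.5;
}
```

A small, fast-moving critter would score high on "moving" but low on "big", so the min stays low and the PS3 is left alone.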
Continuing on the idea of fuzzy logic, it is my impression that we will eventually be using fuzzy logic or some kind of Bayesian probability to combine all of the input we have to formulate a threat level.
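For the Bayesian option, one simple form is a naive Bayes update in odds form: start from a prior threat probability and multiply the odds by one likelihood ratio per sensor. This is only a sketch under the assumption that the sensors are conditionally independent, which may well not hold here; updateThreat() and its parameters are hypothetical names.

```c
#include <stddef.h>

/* Naive Bayes update in odds form.
 * prior: threat probability before looking at sensors, strictly
 *        between 0 and 1.
 * likelihood_ratios[i]: P(reading_i | threat) / P(reading_i | no threat),
 *        assumed independent across sensors.
 * Returns the posterior threat probability. */
double updateThreat(double prior, const double *likelihood_ratios, size_t n)
{
    double odds = prior / (1.0 - prior);
    size_t i;
    for (i = 0; i < n; i++) {
        odds *= likelihood_ratios[i];
    }
    return odds / (1.0 + odds);
}
```

A ratio above 1 pushes the threat level up, a ratio below 1 pulls it down, and a sensor with nothing to say contributes a ratio of 1.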
So far, I have a test machine for OpenCV set up on my Linux box in labstaff, and am currently perfecting the toolchain needed to get CVCell going. I will also need to look into patching OpenCV to be able to grab a stream natively, instead of independently grabbing each frame, saving it to disk, and then loading it back in with the regular image file loading. As a temporary speedup, I can allocate a small portion of RAM to a little filesystem that could be used to cache the image and remove the hard-disk bottleneck that plagues my current solution. This still doesn't solve the lag associated with making 30 separate HTTP connections a second though.
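One way around the per-frame connections is to hold a single connection open to the camera's MJPEG stream and cut the frames out of the incoming bytes. The sketch below is not an OpenCV patch, just a guess at the core step: scanning a buffer for the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers. findJpegFrame() is a hypothetical name, and the approach assumes the frames carry no embedded thumbnails, which would contain their own end marker.

```c
#include <stddef.h>
#include <stdint.h>

/* Scans buf[0..len) for the first complete JPEG frame.
 * On success returns 1 and sets *start to the index of the FF D8
 * marker and *end to one past the FF D9 marker.
 * Returns 0 if no complete frame is in the buffer yet. */
int findJpegFrame(const uint8_t *buf, size_t len,
                  size_t *start, size_t *end)
{
    size_t i;
    size_t frame_start = 0;
    int have_start = 0;

    for (i = 0; i + 1 < len; i++) {
        if (buf[i] != 0xFF) {
            continue;
        }
        if (!have_start && buf[i + 1] == 0xD8) {
            frame_start = i;
            have_start = 1;
            i++; /* skip the second marker byte */
        } else if (have_start && buf[i + 1] == 0xD9) {
            *start = frame_start;
            *end = i + 2;
            return 1;
        }
    }
    return 0;
}
```

The caller would keep appending received bytes to a buffer, call this after each read, hand buf[start..end) to the JPEG decoder when it returns 1, and then drop everything before end.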