Kind of like cat scan. But not. In any way at all... July 31, 2009 7:56 AM

I'm looking for a site that was posted on MeFi aaaaaages ago about stopping a cat bringing mice in using a webcam to lock the cat flap. Does anyone remember what it was called?

... because I'm really hoping they made the electronics and software open source...

Evil bundles of nature murdering cuteness...
posted by twine42 to MetaFilter-Related at 7:56 AM (28 comments total) 1 user marked this as a favorite

You're probably talking about Flo Control, which I found referenced in an askme thread when I searched on cat webcam.
posted by cortex (staff) at 8:01 AM on July 31, 2009


AskMe thread it was referenced in; another, later askme, and the original metafilter post about it back in 2002.
posted by cortex (staff) at 8:03 AM on July 31, 2009


This, linked from here.

(found via)
posted by DU at 8:03 AM on July 31, 2009


damnit, I should have been able to find that.

Looks like there's no info on how they actually do what they're doing. Looks like I have a use for my upcoming askme.

Cheers cortex.
posted by twine42 at 8:05 AM on July 31, 2009


There's tons of info on how it's done, from hardware through to an English description of the algorithm.
posted by DU at 8:08 AM on July 31, 2009


Most of this is reproducible by a very handy person, except for this part: "As mentioned on the theory page, before comparing the images we convert them into records describing discrete features." The skill threshold on this one is very high: your typical garage tinkerer is not going to know how to convert grayscale bitmaps into a set of "features" which describe curves which can be compared in a way that is at least somewhat independent of scale, position and rotation, and to do it 20x/sec.
posted by George_Spiggott at 8:53 AM on July 31, 2009


True, although the way the hardware is set up you probably *could* do a plain pixel-based comparison. The example images show almost no variation, presumably because of the tight physical constraints on where the cat has to be to get through the door.

A 2009 computer (rather than the original 2002) should be able to do that fast enough to close the door. Alternatively, keep the door closed and require the cat to pass the test before opening it. They will train on a little beep or buzz pretty fast.

The Flo Control main page mentions an "upcoming" product that you can email them about. Seems doubtful that ever materialized, but maybe.
posted by DU at 9:04 AM on July 31, 2009


A pixel-based comparison might save you a little math but it just moves the difficulty around without reducing it, and getting a hit rate with the same reliability (to keep you out of the Confuse-A-Cat business) would probably be more difficult, not less. If you carved up a representative silhouette into regions (these must be dark, these must be light, and these, a broad stroke around the cat's profile, may be anywhere between the two levels), you might end up after a lot of trial and error with something with few if any false negatives on your cat and maybe no false positives on cat+mouse, but at best your false positives on skunks and raccoons are going to be high. It's probably worth the one-time investment in working out the vector calculations.
posted by George_Spiggott at 9:20 AM on July 31, 2009


I wasn't thinking of carving it into regions exactly. At least not with the trial and error aspect. Just average 100 or so shots of a mouse-less cat into one integrated image. Then do a simple "number of pixels that match and how strongly" comparison with your test image. The skunk and bird images are going to completely fail for sure.

If there is processor time to spare, maybe try each image rotated through -10/+10°, although there's so little variation in the examples that probably isn't even necessary.
posted by DU at 9:31 AM on July 31, 2009
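
A minimal sketch of the averaging idea in the comment above, in plain Python. The names `average_images` and `match_score`, the nested-list grayscale representation (0.0 dark, 1.0 light), and the scoring formula (one minus the absolute difference, averaged over all pixels) are assumptions for illustration, not anything taken from Flo Control.

```python
from typing import List

# Assumed representation: grayscale values in [0.0, 1.0], 0.0 = dark.
Image = List[List[float]]


def average_images(shots: List[Image]) -> Image:
    """Average equally sized shots pixel by pixel into one template."""
    rows, cols = len(shots[0]), len(shots[0][0])
    return [
        [sum(shot[r][c] for shot in shots) / len(shots) for c in range(cols)]
        for r in range(rows)
    ]


def match_score(template: Image, test: Image) -> float:
    """Mean per-pixel agreement: 1.0 is identical, 0.0 is fully opposite."""
    total = 0.0
    count = 0
    for t_row, x_row in zip(template, test):
        for t, x in zip(t_row, x_row):
            # Pixels that agree strongly contribute close to 1.0.
            total += 1.0 - abs(t - x)
            count += 1
    return total / count
```

The score at which to unlock would have to be found by experiment on real shots; the thread does not suggest a value.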


It occurs to me (if the algorithm is the biggest sticking point precisely as described, which I'm not certain it is) that if you can get your accuracy up to simply "no false negatives on cat alone, no false positives on cat+mouse", then you could deal with the skunk problem by preserving the magnetic collar tag actuator part of the mechanism. For the latch to work, both silhouette match and collar tag must be true. A skunk won't have the collar tag.
posted by George_Spiggott at 9:38 AM on July 31, 2009


DU, a "number of pixels" approach will fail because the system requires only one positive in order to unlock; it has to be this way because a moving cat is going to match the profile by your best estimate in a very small number of frames, probably only one or two at most. You can't refuse to unlock because some number of frames failed to match, you must unlock if a frame did. We have to assume that a skunk banging around in there is going to produce something within the expected range of light vs. dark pixel proportions a lot of the time.
posted by George_Spiggott at 9:44 AM on July 31, 2009


You don't even need that much accuracy. All you need is "no false positives on cat+mouse". If you occasionally decline to pass a mouseless cat, it's no big deal. The cat will try again. (Cat owners, amirite?)
posted by DU at 9:45 AM on July 31, 2009


You can't refuse to unlock because some number of frames failed to match, you must unlock if a frame did. We have to assume that a skunk banging around in there is going to produce something within the expected range of light vs. dark pixel proportions a lot of the time.

Are we looking at the same project? Flo Control already chooses a single frame (or small set of frames) based on when the cat's nose (front edge of the dark area) reaches the center of the image. So there's the "fail frames" problem solved.

As for "banging around", just up the physical constraints a notch or two. The cat already has to squeeze through a tight spot, which puts its face right in line with the camera. Make sure the same applies to the skunk.

Better yet, put the RFID door on the outside, *before* the camera. Then the camera just has to find a mouse, not exclude skunks.
posted by DU at 9:49 AM on July 31, 2009


The frames are taken at intervals that can't be synchronized to the cat's location and position, so the cat will be in various locations, darkening an arbitrary but generally increasing number of pixels, during its transit. A cat entering the frame with a mouse in its mouth will darken the number of pixels in your trigger range at some point, and for reasons I gave above, if that's your trigger, you have to unlock. I don't think the problem is solvable (if we consider a significant probability of false positives to be completely defeating the purpose) without factoring in shape.
posted by George_Spiggott at 9:52 AM on July 31, 2009


Didn't preview -- you've spoken to the timing issue, but now I'm out of time. This is fun, though.
posted by George_Spiggott at 9:54 AM on July 31, 2009


The cat's image slides in more or less smoothly from one side due to physical constraints. The first image during which the center of the image is dark is chosen for comparison (and is what all the baseline images are composed of). If that comparison matches, unlock. If not, remain locked. Reset when there's no dark patches (i.e. the cat has withdrawn).
posted by DU at 9:57 AM on July 31, 2009
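
The trigger logic in the comment above can be sketched as a small state machine. This is an illustration under assumptions: frames are binary silhouettes (1 = dark), "center" means the single middle pixel, and `FlapGate`, `center_is_dark`, `has_dark` and the `matches` callback are hypothetical names, not part of Flo Control.

```python
from typing import Callable, List

# Assumed representation: binary silhouette, 1 = dark pixel, 0 = light.
Frame = List[List[int]]


def center_is_dark(frame: Frame) -> bool:
    """True when the middle pixel of the frame is dark."""
    return frame[len(frame) // 2][len(frame[0]) // 2] == 1


def has_dark(frame: Frame) -> bool:
    """True when any pixel in the frame is dark."""
    return any(any(row) for row in frame)


class FlapGate:
    """Compare only the first center-dark frame of each approach."""

    def __init__(self, matches: Callable[[Frame], bool]) -> None:
        self.matches = matches  # hypothetical comparison against the baseline
        self.decided = False
        self.unlocked = False

    def feed(self, frame: Frame) -> bool:
        if not has_dark(frame):
            # The animal has withdrawn: reset for the next approach.
            self.decided = False
            self.unlocked = False
        elif not self.decided and center_is_dark(frame):
            # First frame with a dark center: decide once and hold.
            self.decided = True
            self.unlocked = self.matches(frame)
        return self.unlocked
```

Later frames in the same approach are ignored, so a failed comparison stays locked until the frame is empty again.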


Oh man, I fucking HATE cats!

Just kidding. I have two.

But I BEAT them ALL the TIME!!

With love. I beat them with kisses and smooches.

SMOOCHES FROM MY BASEBALL BAT!!!

Actually, I don't own a baseball bat.

I don't own two cats either.

They just live in my apartment with me.
posted by jabberjaw at 9:59 AM on July 31, 2009 [4 favorites]


Okay, I have a moment now. DU, let's assume we've dealt with the not-cat-at-all case using our collar tag. Boiled down, you're asserting that (proportion of darkened pixels) is sufficient to distinguish the silhouette of cat-alone from cat-carrying-arbitrary-dead-thing, reliably under real world conditions whose variability is not unconstrained but is still quite fuzzy in ways we're probably not going to fully explore here. Let's put it this way: I'd reject that design if it were brought to me, not least because we can make no statement at all about the leading edge or light-blocking properties of the dead thing. The extreme case is that of a dead or nearly dead bird in whatever state of being carried. You've seen dead birds, I'm sure; I couldn't begin to formulate a general statement about the number of pixels it would darken no matter how you constrain the timing or placement of the leading edge.

The burden of proof is on the person presenting the design, so you'd have to show that there's a pixel percentage range sufficiently loose to accommodate our cat-alone cases that wouldn't be achieved by cat+nasty an unacceptable proportion of the time.
posted by George_Spiggott at 10:57 AM on July 31, 2009


Boiled down, you're asserting that (proportion of darkened pixels) is sufficient to distinguish the silhouette of cat-alone from cat-carrying-arbitrary-dead-thing

No, I'm not. It isn't the number of pixels and it isn't the proportion. It's *which ones match*. Basically, AND the training set and the target image together. Sum the result. Large numbers are better matches.

This method lets you compare the outlines on a shape basis without having to extract the shapes or "understand" them.
posted by DU at 5:07 PM on July 31, 2009
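
One reading of "AND them together and sum", sketched in plain Python. The binary-mask representation and both function names are assumptions. Note that a literal AND over dark pixels cannot penalise extra dark area (a mouse adds dark pixels but removes none of the overlap), so the sketch also shows a variant that counts agreement on light pixels too; that variant is a swapped-in interpretation, not something stated in the comment.

```python
from typing import List

# Assumed representation: 1 = dark pixel, 0 = light pixel.
Mask = List[List[int]]


def and_sum(template: Mask, test: Mask) -> int:
    """Count pixels that are dark in both images (literal AND, then sum)."""
    return sum(
        t & x
        for t_row, x_row in zip(template, test)
        for t, x in zip(t_row, x_row)
    )


def agreement_sum(template: Mask, test: Mask) -> int:
    """Count pixels where both are dark plus pixels where both are light."""
    return sum(
        1
        for t_row, x_row in zip(template, test)
        for t, x in zip(t_row, x_row)
        if t == x
    )


cat = [[0, 0, 0], [1, 1, 0], [1, 1, 0]]
cat_with_mouse = [[0, 0, 0], [1, 1, 1], [1, 1, 0]]

print(and_sum(cat, cat), and_sum(cat, cat_with_mouse))              # 4 4
print(agreement_sum(cat, cat), agreement_sum(cat, cat_with_mouse))  # 9 8
```

In this toy case the literal AND gives the same score with and without the extra dark pixel, while the agreement count drops by one.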


What if you have multiple cats of different colours? Or different sizes? What if the mice are different colours and/or sometimes the same colour as the cat? What if one cat likes to catch birds, one likes mice and the third just brings home lizard tails? What if your cat just learns how to jimmy open the window?

Maybe these are just problems in my household and not worth considering ...
posted by shelleycat at 6:53 PM on July 31, 2009


Maybe it's better to put in a microphone and exclude the cat making that proud crowing/wailing noise that says 'hey, look what I caught for dinner!'. I reckon you'd get a higher success rate, in my house at least.
posted by shelleycat at 6:55 PM on July 31, 2009


Oh weird, I was thinking about this a couple of days ago, this exact same problem! I tried to find the site but couldn't, and was trying to figure out how the guy did it. Curiously, I was also trying to figure out if it would be better to define the shape of the cat or do something like convert it to a grey scale image and do a statistical confidence test on where the pixels are. I think DU's method might actually be doable if it is what I was thinking, especially if you constrain the confidence area into a small area. The cat is unlikely to enter the small space at a weird angle, or say, on its back. You would only have to really test the area where the mouth is, and on determining it is a cat (a more fuzzy test) determining if the confidence is high enough in the mouth area. I think it would probably be doable and easier to figure out than doing the vector calculations.

I have no idea how you'd go about converting the shape to a vector and then doing a comparison. What's the area of math that is involved in vector shape comparison? This sounds like something a Viennese nerd in the 17th century figured out, I'm sure there's probably a software library that does the heavy lifting for this already.
posted by geoff. at 6:59 PM on July 31, 2009
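
The two-stage test suggested above (a fuzzy whole-image check, then a strict check on the mouth area) might look like this. Everything named here is hypothetical: the binary masks, the `(top, left, height, width)` rectangle for the mouth, and the two thresholds, which are placeholders rather than measured values.

```python
from typing import List, Tuple

# Assumed representation: 1 = dark pixel, 0 = light pixel.
Mask = List[List[int]]
Rect = Tuple[int, int, int, int]  # (top, left, height, width)


def region_agreement(template: Mask, test: Mask, rect: Rect) -> float:
    """Fraction of pixels inside rect where template and test agree."""
    top, left, height, width = rect
    agree = 0
    for r in range(top, top + height):
        for c in range(left, left + width):
            if template[r][c] == test[r][c]:
                agree += 1
    return agree / (height * width)


def cat_without_prey(
    template: Mask,
    test: Mask,
    mouth: Rect,
    whole_min: float = 0.8,   # placeholder: loose "is it the cat?" threshold
    mouth_min: float = 0.95,  # placeholder: strict "is the mouth empty?" threshold
) -> bool:
    """Loose match on the whole frame, strict match on the mouth region."""
    whole = (0, 0, len(template), len(template[0]))
    if region_agreement(template, test, whole) < whole_min:
        return False
    return region_agreement(template, test, mouth) >= mouth_min
```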


What if you have multiple cats of different colours? Or different sizes?

They each have a different RFID collar, so you know which training set image to use to compare against.

What if the mice are different colours and/or sometimes the same colour as the cat?

The color is irrelevant, since the match is basically just a silhouette. (I can't believe I just spelled that right on the first try.)

What if one cat likes to catch birds, one likes mice and the third just brings home lizard tails?

Are you planning to let some of these categories in and exclude the others? If not, then it doesn't matter. "Stuff in the cat's mouth" == "leave him out".
posted by DU at 7:11 PM on July 31, 2009
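
A sketch of the per-collar lookup described above: the tag picks which training image to compare against, and no tag means no unlock. The tag IDs, the `templates` dictionary and the `matches` callback are all hypothetical; how a tag is actually read is outside this sketch.

```python
from typing import Callable, Dict, List, Optional

# Assumed representation: 1 = dark pixel, 0 = light pixel.
Mask = List[List[int]]


def should_unlock(
    tag_id: Optional[str],
    frame: Mask,
    templates: Dict[str, Mask],
    matches: Callable[[Mask, Mask], bool],
) -> bool:
    """Unlock only if a known tag is present and its own template matches."""
    if tag_id is None:
        return False  # no collar tag: skunk, raccoon, or a stranger
    template = templates.get(tag_id)
    if template is None:
        return False  # tag read, but no training image for it
    return matches(template, frame)
```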


They each have a different RFID collar, so you know which training set image to use to compare against.

OK, that's clever, I wouldn't have thought of that (although it's obvious now someone else has!). It can also take care of the third issue since cats generally have preferred prey, so the dark brown cat is always going to have a lizard, no point looking for blackbirds against that background (lizard tails are the grossest type of prey but are very small and hard to see, so it would actually be the trickiest but most important to find).

But for the colour one the webcam would have to be pretty good, because a black mouse against a black cat really isn't easy to see. Sparrows against a tabby would also be difficult since all you see is a few tail feathers and maybe a foot sticking out of the mouth against a multi-coloured moving background. It might be helped by a motion sensor light to throw contrast, it's normally pretty dark right in by the back door.
posted by shelleycat at 7:22 PM on July 31, 2009


I thought my idea was really simple and obvious, but the amount of blowback and confusion I'm getting indicates it's either worldshaking genius or utter idiocy. Obviously I'm going to have to try it and see which.
posted by DU at 7:24 PM on July 31, 2009


DU -- sorry, I misunderstood you. That seems fairly workable on the face of it. Given the variations in size, horizontal position and angle I'd guess it'd be on the same rough order of accuracy as the "regions" approach. But with either approach you could correct for horizontal offset in a fast, crude way before the comparison began. Perhaps it could reject skunks and raccoons more effectively; it's not easy to say offhand.
posted by George_Spiggott at 7:28 PM on July 31, 2009
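
One fast, crude way to correct horizontal offset before comparing, sketched under assumptions: the animal enters from the left, so the nose is the rightmost dark column, and images are binary masks. The function names are hypothetical.

```python
from typing import List, Optional

# Assumed representation: 1 = dark pixel, 0 = light pixel.
Mask = List[List[int]]


def leading_edge(mask: Mask) -> Optional[int]:
    """Index of the rightmost column containing a dark pixel, or None."""
    for c in range(len(mask[0]) - 1, -1, -1):
        if any(row[c] for row in mask):
            return c
    return None


def shift_columns(mask: Mask, dx: int) -> Mask:
    """Shift right by dx columns (left if negative), padding with light."""
    cols = len(mask[0])
    shifted = []
    for row in mask:
        new_row = [0] * cols
        for c, value in enumerate(row):
            if 0 <= c + dx < cols:
                new_row[c + dx] = value
        shifted.append(new_row)
    return shifted


def align_to(template: Mask, test: Mask) -> Mask:
    """Shift test so its leading edge lines up with the template's."""
    t_edge, x_edge = leading_edge(template), leading_edge(test)
    if t_edge is None or x_edge is None:
        return test  # nothing dark to align on
    return shift_columns(test, t_edge - x_edge)
```

The cost is one pass over the columns plus one copy, which is small next to the comparison itself.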


Come to think of it, with either approach you could make some pretty good scaling decisions based on an optimistic sampling algorithm prior to the compare as well. That leaves us with only the orientation problem skewing the accuracy: let's assume the main variation is in the axis described by "nose elevation", that is, a pivot around a point on the bitmap. I surmise that there's a straightforward sampling algorithm that would do a good job of guessing how to correct for that as well -- though the amount of processing involved in rotating an image in amounts of less than 90 degrees is of greater magnitude than the comparison algorithms we're talking about. We might be better off with the "regions" approach there, because the handful of vectors involved in our template can be (virtually) "rotated" a lot more cheaply.
posted by George_Spiggott at 7:36 PM on July 31, 2009


I think a lot of the people here deciding that pixel-based approaches are all doomed haven't actually looked at the Flo Control setup. It doesn't use a full color frontal webcam shot as input; it uses a silhouette in profile, which ought to cut the amount of image processing required quite substantially. A bit of translation and rotation combined with a database search for the training image that matches best should allow you to do a quite effective pixel-based approach.
posted by flabdablet at 12:37 AM on August 2, 2009 [1 favorite]

