Thursday, April 14, 2011

Subclassing NSInputStream

Cocoa's NSInputStream is great, but sometimes it doesn't have all the functionality you need. For example, you might want to encrypt a file on the fly as you stream it off the disk, or you might want to put up a progress bar indicating how far the input stream has progressed through a large file. NSInputStream doesn't support either of these natively, so this sounds like a great place for an NSInputStream subclass, right?


Right, but with one caveat: subclasses of NSInputStream don't work correctly when used with Cocoa's URL loading mechanism. (Apple folks: see rdar://problem/3222783.)

The Problem

When you make an NSInputStream subclass and try to pass it to -[NSMutableURLRequest setHTTPBodyStream:], your app will quickly crash with an 'unrecognized selector' exception for the following method:

- (void) _scheduleInCFRunLoop:(CFRunLoopRef)inRunLoop forMode:(CFStringRef)inMode

If you implement that, you'll then get another unrecognized selector:

- (BOOL) _setCFClientFlags:(CFOptionFlags)inFlags
                  callback:(CFReadStreamClientCallBack)inCallback
                   context:(CFStreamClientContext *)inContext

If you give that one an empty implementation too (return YES;) and the file you're uploading is small, you may manage to make it to a third unimplemented selector:

- (void) _unscheduleFromCFRunLoop:(CFRunLoopRef)inRunLoop forMode:(CFStringRef)inMode

Unfortunately, while you can avoid the unrecognized selector crashes by implementing these methods, you're very unlikely to get through your whole stream. Further, Apple engineers have stated that such naive implementations are "definitely not safe" and will likely lead to failing in "strange and unexpected ways."

The backstory

The real question here is what these methods are for in the first place. It turns out that these three methods are simply the toll-free bridging versions of CFReadStreamScheduleWithRunLoop, CFReadStreamSetClient, and CFReadStreamUnscheduleFromRunLoop, respectively. Calling CFReadStreamScheduleWithRunLoop from _scheduleInCFRunLoop:..., for example, is a quick way to infinite recursion.

The NSStream documentation indicates that subclasses must override -(void)scheduleInRunLoop:forMode: and -(void)removeFromRunLoop:forMode:. Why? Because the stream's delegate usually needs to be notified when there are bytes available to be read, and getting that information often requires being scheduled on a run loop. Our three mystery methods serve the same purpose.

NSInputStream is toll-free bridged with CFReadStream. Mostly. From the CFReadStream Reference:
CFReadStream is “toll-free bridged” with its Cocoa Foundation counterpart, NSInputStream. This means that the Core Foundation type is interchangeable in function or method calls with the bridged Foundation object. Therefore, in a method where you see an NSInputStream * parameter, you can pass in a CFReadStreamRef, and in a function where you see a CFReadStreamRef parameter, you can pass in an NSInputStream instance. Note, however, that you may have either a delegate or callbacks but not both.
These methods are required to support the CFReadStream client callbacks, which are distinct from the delegate callbacks.

The solution

-[NSInputStream _scheduleInCFRunLoop:forMode:] is the equivalent of CFReadStreamScheduleWithRunLoop for your stream. Do whatever you need to do so that you can deliver kCFStreamEventHasBytesAvailable notifications (and any other notifications requested) at the proper time. That may involve scheduling a timer on the run loop or, if your subclass is just wrapping a vanilla NSInputStream, simply scheduling that stream on the run loop. Implement this method as if you were implementing CFReadStreamScheduleWithRunLoop for your stream.

-[NSInputStream _setCFClientFlags:callback:context:] is the equivalent of CFReadStreamSetClient, and here you need to do a few things. If the context and callback arguments are not NULL, the caller is trying to set up a callback client, and you need to do the following:
  1. Inspect the flags, and record which notifications are requested. The possible values are listed in the CFStream Event Type Constants documentation.
  2. Copy the pointer to the callback function. You'll need to use it later.
  3. Copy the context. The documentation for CFReadStreamSetClient indicates that the context struct passed by the caller should be copied, and that the caller is not responsible for preserving it. memcpy(&myLocalContextCopy, thePassedContext, sizeof(CFStreamClientContext)) works just fine.
  4. Retain the context->info. The context struct includes a void *info member and a CFAllocatorRetainCallBack retain member. Call the retain function on the info pointer (if the retain function is not NULL).
If the context and callback parameters are NULL, then the caller is removing the callback client, and you need to do the following:
  1. Call the release function (from the context you previously copied) on the info pointer in that context.
  2. Remove your copy of the context and callback; they are no longer needed.
Finally, return YES to indicate that the asynchronous scheduling was successful.
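
The bookkeeping described in the steps above might look roughly like the following. This is a minimal sketch rather than the code from the sample project: it assumes ARC, and the class name SketchInputStream, the ivar names, and the sketch_clearClient helper are all made up for illustration. Clearing any previous client before registering a new one, and storing the pointer returned by the retain callback, are choices of this sketch.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class name; only the client bookkeeping is shown here.
@interface SketchInputStream : NSInputStream
@end

@implementation SketchInputStream {
    CFOptionFlags _requestedEvents;              // which events the client asked for
    CFReadStreamClientCallBack _clientCallback;  // function to call for those events
    CFStreamClientContext _clientContext;        // our own copy of the caller's context
}

// Hypothetical helper: forget the current client, balancing the earlier retain.
- (void)sketch_clearClient {
    if (_clientContext.info != NULL && _clientContext.release != NULL) {
        _clientContext.release(_clientContext.info);
    }
    memset(&_clientContext, 0, sizeof(CFStreamClientContext));
    _clientCallback = NULL;
    _requestedEvents = kCFStreamEventNone;
}

- (BOOL)_setCFClientFlags:(CFOptionFlags)flags
                 callback:(CFReadStreamClientCallBack)callback
                  context:(CFStreamClientContext *)context {
    // Drop any earlier registration first so that its info pointer is not leaked.
    [self sketch_clearClient];

    if (callback != NULL && context != NULL) {
        _requestedEvents = flags;                                         // step 1
        _clientCallback = callback;                                       // step 2
        memcpy(&_clientContext, context, sizeof(CFStreamClientContext));  // step 3
        if (_clientContext.info != NULL && _clientContext.retain != NULL) {
            // Step 4: retain the info pointer and keep whatever pointer comes back.
            _clientContext.info = _clientContext.retain(_clientContext.info);
        }
    }
    return YES;
}

- (void)dealloc {
    // Make sure a still-registered client's info pointer is released.
    [self sketch_clearClient];
}

@end
```

Because the ivars of a new Objective-C object start out zeroed, calling the helper before any client has been registered is harmless.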

-[NSInputStream _unscheduleFromCFRunLoop:forMode:] is the equivalent of CFReadStreamUnscheduleFromRunLoop. You should remove anything you scheduled in the _scheduleInCFRunLoop:forMode: method.
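
For the simple case where the subclass just wraps a vanilla NSInputStream, the scheduling methods can forward to the wrapped stream. Here is a rough sketch of that, assuming ARC and a reasonably recent SDK (for the NSRunLoopMode type). The class name SketchInputStream and the wrappedStream property are hypothetical. Note that the CF functions are called on the wrapped stream, never on self, which avoids the infinite recursion mentioned earlier.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper class; only the run loop plumbing is shown.
@interface SketchInputStream : NSInputStream
@property (nonatomic, strong) NSInputStream *wrappedStream;  // the stream doing the real work
@end

@implementation SketchInputStream

// Public overrides that the NSStream documentation asks subclasses to provide.
- (void)scheduleInRunLoop:(NSRunLoop *)runLoop forMode:(NSRunLoopMode)mode {
    [self.wrappedStream scheduleInRunLoop:runLoop forMode:mode];
}

- (void)removeFromRunLoop:(NSRunLoop *)runLoop forMode:(NSRunLoopMode)mode {
    [self.wrappedStream removeFromRunLoop:runLoop forMode:mode];
}

// Private hooks used by the URL loading system. The wrapped stream is a
// genuine CFReadStream, so it can be handed to the CF functions directly.
- (void)_scheduleInCFRunLoop:(CFRunLoopRef)runLoop forMode:(CFStringRef)mode {
    CFReadStreamScheduleWithRunLoop((__bridge CFReadStreamRef)self.wrappedStream, runLoop, mode);
}

- (void)_unscheduleFromCFRunLoop:(CFRunLoopRef)runLoop forMode:(CFStringRef)mode {
    CFReadStreamUnscheduleFromRunLoop((__bridge CFReadStreamRef)self.wrappedStream, runLoop, mode);
}

@end
```

A stream that produces its own data has no wrapped stream to lean on and would need its own run loop source or timer instead.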

Once you've done these, make sure you've implemented the other methods required for subclasses as documented in the NSInputStream and NSStream reference. Now, when you pass off your NSInputStream subclass to NSURLRequest, your stream will be scheduled on the run loop and the client callbacks will be set up. Once you're scheduled on the run loop, you should be able to notify your client when you have bytes available to read, when you've reached the end of the file, and when an error has occurred. Make sure you're passing those along correctly, and everything will be good to go.
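
Passing notifications along comes down to two checks: is there a client, and did it ask for this kind of event? The helper below is a small sketch of that logic, assuming ARC. SketchNotifyClient is a hypothetical function name, and its parameters stand in for whatever flags, callback, and info pointer your subclass recorded when the client was set. It relies on NSStreamEvent and CFStreamEventType using the same bit values.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: deliver `event` to the CF client, but only if a client
// is registered and it asked to hear about this kind of event.
static void SketchNotifyClient(NSInputStream *stream,
                               NSStreamEvent event,
                               CFOptionFlags requestedEvents,
                               CFReadStreamClientCallBack callback,
                               void *info) {
    CFStreamEventType cfEvent = (CFStreamEventType)event;
    if (callback != NULL && (requestedEvents & cfEvent) != 0) {
        // The client expects to see our stream, expressed as a CFReadStreamRef.
        callback((__bridge CFReadStreamRef)stream, cfEvent, info);
    }
}
```

In a wrapping subclass, one natural place to call something like this is -stream:handleEvent:, after making the subclass the delegate of the wrapped stream, passing self as the stream argument.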


I've put together a sample implementation of an NSInputStream subclass that demonstrates what I've presented. HSCountingInputStream is a simple class that wraps an NSInputStream, counting the number of instances of a given character that have passed through it. Nearly every call is simply passed through to the underlying NSInputStream; the only real code of interest is in -read:maxLength: and in the undocumented methods discussed above.
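
To give a feel for the counting part, here is a rough sketch of what a counting -read:maxLength: could look like. It is not the HSCountingInputStream source: the class name SketchInputStream and the wrappedStream, characterToCount, and characterCount properties are hypothetical, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical counting wrapper; only the read path is shown.
@interface SketchInputStream : NSInputStream
@property (nonatomic, strong) NSInputStream *wrappedStream;      // the stream doing the real work
@property (nonatomic) uint8_t characterToCount;                  // byte value to look for
@property (nonatomic, readonly) NSUInteger characterCount;       // matches seen so far
@end

@implementation SketchInputStream

- (NSInteger)read:(uint8_t *)buffer maxLength:(NSUInteger)maxLength {
    NSInteger bytesRead = [self.wrappedStream read:buffer maxLength:maxLength];
    // bytesRead is 0 at end of stream and negative on error, so the loop
    // only runs when real data arrived.
    for (NSInteger i = 0; i < bytesRead; i++) {
        if (buffer[i] == self.characterToCount) {
            _characterCount++;
        }
    }
    return bytesRead;
}

@end
```

The count accumulates across calls, so it stays correct even when the data arrives in several partial reads.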

I'll keep an eye on the comments, so if you have any questions, please let me know. I'll probably write up another post here shortly about how I figured all of this out, for those who are interested.