inotify is a great tool for monitoring changes to a directory. You specify the types of events you watch and read a file descriptor, which blocks until something interesting happens. And there are Haskell bindings in the form of hinotify, which replaces this low-level approach with one where you specify a callback for the various event types. In Haskell terms, hinotify has an initINotify :: IO INotify that gives you a value that refers to a specific notifier, and addWatch :: INotify -> [EventVariety] -> FilePath -> (Event -> IO ()) -> IO WatchDescriptor that lets you specify what types of events you want, what path to watch them on, and what to do when you get an event.
It works great, but there’s one problem: watches aren’t recursive. If I have foo/bar/baz and I put a watch on foo, it doesn’t trigger when I touch baz. I assume this is a performance vs. power tradeoff; I’m not a Linux kernel developer by any stretch, so I won’t say whether that’s the right or wrong decision to make here. But it does mean that if, for example, I want to be notified in Milagos when the user edits a post so I can reload it, I have to do interesting things. In particular, I can’t just watch each existing subdirectory of posts/, because that won’t pick up directories created later. So here’s what I wound up doing. First, let’s build the low-level watcher that reloads the post database when a directory changes:
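-- Every kind of change to a post directory that should trigger a reload.
changeEvents = [Create, Delete, DeleteSelf, Modify, MoveIn, MoveOut, MoveSelf]

-- Watch posts/<dir>, discarding the WatchDescriptor that addWatch returns.
watchPost :: INotify -> FilePath -> IO ()
watchPost i dir =
    void $ addWatch i changeEvents ("posts" </> dir) (reload dir)

-- Log the event, then reload the post database.
reload :: FilePath -> Event -> IO ()
reload dir ev = do
    putStrLn $ "Caught an event " ++ show ev ++ ", in " ++ dir ++ ", reloading"
    runPool dbconf reloadDB pool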
So we have a list of events that we care about, and a watchPost that takes a post’s directory name and registers a watcher for it; when the watcher triggers, it prints out the event and reloads the database. (The dbconf and pool stuff is specific to Yesod and not important here.) The void :: (Functor f) => f a -> f () is there because we’ll call watchPost from inside another watcher’s callback, and the callback in addWatch must have type Event -> IO (), whereas addWatch, when fully applied, gives back an IO WatchDescriptor.
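To make the type juggling concrete, here is a small sketch with a stand-in for addWatch. The names register and registerQuietly are hypothetical; they only mimic the shape "an action that returns a handle we don’t need":

```haskell
import Data.Functor (void)

-- Stand-in for a fully applied addWatch: it performs an effect and
-- returns a handle (here just an Int) that the caller does not need.
register :: FilePath -> IO Int
register path = do
    putStrLn ("watching " ++ path)
    pure (length path)

-- void throws the Int away, so the result fits anywhere an IO () is expected,
-- such as the body of an Event -> IO () callback.
registerQuietly :: FilePath -> IO ()
registerQuietly = void . register
```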
Now that we’ve gotten that out of the way, we can actually get down to the process of setting up watchers. First, grab an INotify:
inotify <- initINotify
Next, watch posts itself so we can set up new watchers:
void . addWatch inotify [Create] "posts" $
    \ev -> when (isDirectory ev) $ watchPost inotify (filePath ev) >> runPool dbconf reloadDB pool
Here we only watch creation events; inotify kills off watchers for deleted files. If the created object is a directory, watch it (EDIT: and reload the database in case files were created while we weren’t watching it, thanks lpsmith!); otherwise do nothing.
Finally, we need to actually perform that on each existing directory:
postDirectories <- liftIO . getDirectoryContents $ "posts"
mapM_ (watchPost inotify) $ filter (`notElem` [".", ".."]) postDirectories
We get the list of post directories, filter out . and .., and set up watches for each of them.
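One wrinkle: getDirectoryContents returns every entry in posts, not only directories, so a stray file there would get a watch too. Here is a minimal sketch of a stricter listing, assuming a version of the directory package that provides listDirectory (which already leaves out . and ..). The name listPostDirs is hypothetical and not part of Milagos:

```haskell
import Control.Monad (filterM)
import System.Directory
import System.FilePath ((</>))

-- | Names of the immediate subdirectories of @root@ (hypothetical helper).
-- 'listDirectory' never returns "." or "..", so no manual filter is needed.
listPostDirs :: FilePath -> IO [FilePath]
listPostDirs root = do
    entries <- listDirectory root
    -- Keep only entries that are directories; doesDirectoryExist follows
    -- symbolic links, so a link to a directory is kept as well.
    filterM (\name -> doesDirectoryExist (root </> name)) entries
```

It returns bare names rather than full paths, so it would slot in as listPostDirs "posts" >>= mapM_ (watchPost inotify).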
All together, this is the code for watchPosts:
watchPosts dbconf pool = do
    inotify <- initINotify
    -- Watch posts/ itself so that new post directories get their own watch.
    void . addWatch inotify [Create] "posts" $
        \ev -> when (isDirectory ev) $ watchPost inotify (filePath ev) >> runPool dbconf reloadDB pool
    -- Watch every post directory that already exists.
    postDirectories <- liftIO . getDirectoryContents $ "posts"
    mapM_ (watchPost inotify) $ filter (`notElem` [".", ".."]) postDirectories
  where
    changeEvents = [Create, Delete, DeleteSelf, Modify, MoveIn, MoveOut, MoveSelf]

    watchPost :: INotify -> FilePath -> IO ()
    watchPost i dir =
        void $ addWatch i changeEvents ("posts" </> dir) (reload dir)

    reload :: FilePath -> Event -> IO ()
    reload dir ev = do
        putStrLn $ "Caught an event " ++ show ev ++ ", in " ++ dir ++ ", reloading"
        runPool dbconf reloadDB pool
And people say Haskell can’t do systems programming!
Now, this obviously doesn’t work for arbitrarily deep files. You could certainly extend the technique to trigger on updates to anything arbitrarily deep in the filesystem: use watchers to create other watchers, and rely on the fact that a deleted file’s watcher dies with it. It’s a fairly trivial modification; just change whatever your version of watchPost is to also recursively watch its subdirectories.
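A rough sketch of that recursive walk, kept independent of hinotify: watchTree is a hypothetical name, and its register argument stands in for whatever installs a watch on one directory (for instance, a watchPost-style wrapper around addWatch). It assumes a version of the directory package that provides listDirectory and pathIsSymbolicLink:

```haskell
import Control.Monad (forM_, when)
import System.Directory
import System.FilePath ((</>))

-- | Run @register@ on @dir@ and on every directory beneath it.
-- @register@ is a stand-in for the action that installs a watch.
watchTree :: (FilePath -> IO ()) -> FilePath -> IO ()
watchTree register dir = do
    -- Register the parent first, then descend into its children.
    register dir
    entries <- listDirectory dir
    forM_ entries $ \name -> do
        let path = dir </> name
        isDir <- doesDirectoryExist path
        -- Skip symbolic links so a cycle of links cannot recurse forever.
        isLink <- pathIsSymbolicLink path
        when (isDir && not isLink) $ watchTree register path
```

To cover directories created later, the Create callback would call watchTree on the new directory instead of registering a single watch. The same caveat as before applies: anything created inside it before the watches are in place is missed, so reload afterwards.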
Also, make sure that if you trigger something like recompilation on a change to your source tree, you don’t trigger on writes to any auxiliary files such as object files that are created in the same directory that you’re watching! You could also modify the recursive watch technique to kill the watcher, reload your thing from disk, and then restart it, but you’ll lose notifications that happened while your watcher wasn’t running. The failsafe approach is to ignore events that happen while your process is running, but then you have to track whether your process is running. There’s no good approach for this case, which is why I suspect it’s one that you should configure your tool to avoid.
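For what it’s worth, here is a sketch of both workarounds, using only base and filepath. Everything in it is an assumption for illustration: the extension list, the names isAuxiliary and runUnlessBusy, and the choice of an MVar as the "is it running?" flag. It does not remove the tradeoff described above; events that arrive mid-run are simply dropped.

```haskell
import Control.Concurrent.MVar
import Control.Exception (finally)
import System.FilePath (takeExtension)

-- | Extensions treated as build artifacts (an assumed list; adjust to your tool).
auxiliaryExtensions :: [String]
auxiliaryExtensions = [".o", ".hi", ".dyn_o", ".dyn_hi"]

-- | True when a change to this path should not trigger a rebuild.
isAuxiliary :: FilePath -> Bool
isAuxiliary path = takeExtension path `elem` auxiliaryExtensions

-- | Run @action@ unless a previous run is still in progress.
-- The MVar is full while idle and empty while a run is active.
-- Returns True if the action ran and False if the event was dropped.
runUnlessBusy :: MVar () -> IO () -> IO Bool
runUnlessBusy lock action = do
    token <- tryTakeMVar lock
    case token of
        Nothing -> pure False
        -- Put the token back even if the action throws.
        Just () -> (action `finally` putMVar lock ()) >> pure True
```

A watcher callback could then check isAuxiliary on the event’s path first and hand the rebuild to runUnlessBusy, with the lock created once via newMVar ().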