
Quick and dirty progress tracking

Today I was running some analysis on about 9,000 files, basically mapping a function over each to warm up a cache. Something like this:

* (map nil 'analyze-file *9000-files*)
time passes

I had no idea how far along it was, or whether this was a snack-break job or an overnight one. So I interrupted it and wrote a quick and dirty REPL utility function:

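(defun :dot-every (count fun)
  "Return a function that calls FUN and prints a dot on every COUNT-th call."
  (let ((i 0))
    (lambda (&rest args)
      ;; Every COUNT-th call: reset the counter and print a dot.
      (when (<= count (incf i))
        (setf i 0)
        (write-char #\. *trace-output*)
        (force-output *trace-output*))
      (apply fun args))))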

It prints out one dot per COUNT invocations of the function it returns, giving some indication of progress. Sensible values for COUNT depend on the volume of function calls.
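
As a rough guide to picking COUNT when the number of calls is known in advance, one option is to aim for about a line's worth of dots. A minimal sketch, assuming a hypothetical helper COUNT-FOR-DOTS and an arbitrary target of 80 dots (neither is part of the utility above):

```lisp
;; COUNT-FOR-DOTS is a hypothetical helper, and 80 is an arbitrary
;; default chosen to fit one terminal line.
(defun count-for-dots (total-calls &optional (target-dots 80))
  "Return a COUNT that yields at most TARGET-DOTS dots over TOTAL-CALLS calls."
  (max 1 (ceiling total-calls target-dots)))

(count-for-dots 9000) ; => 113
```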

For this problem, I called it with a COUNT of 100:

* (map nil (:dot-every 100 'analyze-file) *9000-files*)

The cached analyses printed out a ton of dots quickly, and the uncached analyses printed dots at a slow but steady pace. From that I could tell that it would be done in a few minutes instead of a few hours.

So now I'm going to use this to wrap any function I have to call a ton of times when I want a sense of how it's progressing.



I'm thinking it might be convenient not to have to figure out an appropriate value for COUNT, and to report every few seconds instead:

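(defun progress-every (seconds fun)
  "Return a function that calls FUN and, at most once every SECONDS seconds,
prints the number of calls made so far to *TRACE-OUTPUT*."
  (let ((calls 0)
        (interval (* seconds internal-time-units-per-second))
        (last (get-internal-real-time)))
    (lambda (&rest args)
      ;; Read the clock once per call so the comparison and the update agree.
      (let ((now (get-internal-real-time)))
        (when (> (- now last) interval)
          (setf last now)
          (format *trace-output* "[~D]" calls)
          (force-output *trace-output*)))
      (incf calls)
      (apply fun args))))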
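
When the total number of calls is known, the running count printed in brackets can be turned into a rough time estimate. A sketch, assuming a hypothetical helper ESTIMATE-REMAINING-SECONDS and a steady average time per call (an assumption that a mix of cached and uncached files would violate):

```lisp
;; ESTIMATE-REMAINING-SECONDS is a hypothetical helper for illustration.
;; It extrapolates linearly from the average time per call so far.
(defun estimate-remaining-seconds (calls-done total-calls elapsed-seconds)
  "Return a linear estimate of the seconds left, or NIL before the first call."
  (if (zerop calls-done)
      nil
      (* elapsed-seconds (/ (- total-calls calls-done) calls-done))))

;; 3,000 of 9,000 calls finished after 60 seconds:
(estimate-remaining-seconds 3000 9000 60) ; => 120
```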

A thought I've had is that an ideal interactive environment (shells as well as REPLs) would let you do this sort of thing without needing to abort the process. (Shell analogue: move a running process into a screen session.)

For example, in this case the interactive evaluator could (at least, if you had used #' instead of ') automatically insert a wrapper function around analyze-file that simply passes calls through until you tell it to turn into a dot-every.
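
Nothing in standard CL inserts such a wrapper automatically, but the idea can be approximated by hand. A minimal sketch, assuming hypothetical names *DECORATOR* and SWITCHABLE, and assuming *DECORATOR* is set globally in the same Lisp image (for example from the debugger after an interrupt):

```lisp
;; *DECORATOR* and SWITCHABLE are hypothetical names for illustration.
(defvar *decorator* nil
  "NIL, or a function that takes a function and returns a wrapped function.")

(defun switchable (fun)
  "Return a function that calls FUN directly until *DECORATOR* is set,
and through (FUNCALL *DECORATOR* FUN) afterwards."
  (let ((seen nil)       ; the decorator the cached function was built from
        (current fun))   ; the function actually called
    (lambda (&rest args)
      ;; Rebuild only when the decorator changes, so a wrapper's internal
      ;; state (such as a dot counter) survives between calls.
      (unless (eq seen *decorator*)
        (setf seen *decorator*
              current (if seen (funcall seen fun) fun)))
      (apply current args))))

;; Start plain:
;;   (map nil (switchable 'analyze-file) *9000-files*)
;; Later, interrupt, evaluate this, and continue:
;;   (setf *decorator* (lambda (fun) (:dot-every 100 fun)))
```

The cost is one extra variable lookup and comparison per call, which is negligible next to analyzing a file.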

Actually, CL has a somewhat related facility already present, namely TRACE. If I recall correctly, interrupting and doing (trace analyze-file) may or may not have the effect of tracing the calls. But, of course, there's no standard way to take TRACE's behavior and turn it into infrequent dots. But my above remark is about the general notion: an interactive environment “should” preserve the opportunity to change what in principle can be changed if you bothered to keep a reference to it.
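
Part of the "may or may not" is that an implementation may resolve the symbol to a function object once, before the mapping starts, so a later change to the symbol's definition is never seen. A sketch of forcing the lookup on every call instead, assuming a hypothetical helper LATE-BOUND; whether TRACE output then appears still depends on how the implementation implements TRACE:

```lisp
;; LATE-BOUND is a hypothetical helper, not a standard function.
(defun late-bound (name)
  "Return a function that looks up NAME's global definition on every call."
  (lambda (&rest args)
    (apply (fdefinition name) args)))

;; Stand-in for ANALYZE-FILE so the sketch runs on its own.
(defun sample-analyze (x)
  (* 2 x))

(defparameter *late* (late-bound 'sample-analyze))

;; Calls whatever SAMPLE-ANALYZE is defined as at this moment. If it is
;; redefined later (say, from the debugger after an interrupt), calls
;; made through *LATE* pick up the new definition.
(funcall *late* 21)
```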

Well, I just had to stop and blink at the keyword function definition. Do you make a habit of this?

I have a bunch of utility functions, used only in the REPL, that I name with a keyword. Some people prefer a package with a very short name. I prefer the shortest package prefix possible.

April 2015
