@@ -28,6 +28,7 @@ Cache on Already Mounted Filesystem
2828
2929 (*) Debugging.
3030
31+ (*) On-demand Read.
3132
3233
3334 Overview
@@ -482,3 +483,180 @@ the control file. For example::
482483 echo $((1|4|8)) >/sys/module/cachefiles/parameters/debug
483484
484485 will turn on all function entry debugging.
486+
487+
488+ On-demand Read
489+ ==============
490+
491+ When working in its original mode, CacheFiles serves as a local cache for a
492+ remote networking filesystem. In on-demand read mode, CacheFiles can instead
493+ accelerate scenarios where on-demand read semantics are needed, e.g. container
494+ image distribution.
495+
496+ The essential difference between these two modes is seen when a cache miss
497+ occurs: In the original mode, the netfs will fetch the data from the remote
498+ server and then write it to the cache file; in on-demand read mode, fetching
499+ the data and writing it into the cache is delegated to a user daemon.
500+
501+ ``CONFIG_CACHEFILES_ONDEMAND`` should be enabled to support on-demand read mode.
502+
503+
504+ Protocol Communication
505+ ----------------------
506+
507+ The on-demand read mode uses a simple protocol for communication between kernel
508+ and user daemon. The protocol can be modeled as::
509+
510+ kernel --[request]--> user daemon --[reply]--> kernel
511+
512+ CacheFiles will send requests to the user daemon when needed. The user daemon
513+ should poll the devnode ('/dev/cachefiles') to check if there's a pending
514+ request to be processed. A POLLIN event will be returned when there's a pending
515+ request.
516+
517+ The user daemon then reads the devnode to fetch a request to process. It should
518+ be noted that each read only gets one request. When it has finished processing
519+ the request, the user daemon should write the reply to the devnode.
520+
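The poll-then-read step described above can be sketched as follows. This is a minimal illustration, assuming the daemon already holds an open file descriptor for the devnode; the helper name ``wait_and_read_request`` and its timeout parameter are hypothetical and not part of the CacheFiles interface.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until `devfd` reports POLLIN, then issue a
 * single read(). On the cachefiles devnode each read fetches one request,
 * so one call yields at most one message.
 *
 * Returns the number of bytes read, 0 on timeout, or -1 on error.
 */
ssize_t wait_and_read_request(int devfd, void *buf, size_t cap, int timeout_ms)
{
	struct pollfd pfd = { .fd = devfd, .events = POLLIN };
	int ready = poll(&pfd, 1, timeout_ms);

	if (ready < 0)
		return -1;      /* poll() itself failed */
	if (ready == 0)
		return 0;       /* no pending request within the timeout */
	if (!(pfd.revents & POLLIN))
		return -1;      /* error or hang-up instead of readable data */

	return read(devfd, buf, cap);
}
```

A daemon would typically call such a helper in a loop, handle the request it returns, and write the reply before polling again.
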
521+ Each request starts with a message header of the form::
522+
523+ struct cachefiles_msg {
524+ __u32 msg_id;
525+ __u32 opcode;
526+ __u32 len;
527+ __u32 object_id;
528+ __u8 data[];
529+ };
530+
531+ where:
532+
533+ * ``msg_id`` is a unique ID identifying this request among all pending
534+ requests.
535+
536+ * ``opcode`` indicates the type of this request.
537+
538+ * ``object_id`` is a unique ID identifying the cache file operated on.
539+
540+ * ``data`` indicates the payload of this request.
541+
542+ * ``len`` indicates the whole length of this request, including the
543+ header and following type-specific payload.
544+
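As an illustration of how the header fields relate to each other, the sketch below splits a buffer returned by one read into the fixed header and the type-specific payload. The struct ``msg_header`` is a local mirror of the header above, and ``parse_msg_header`` is a hypothetical helper name; neither is provided by the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the fixed part of the message header shown above. */
struct msg_header {
	uint32_t msg_id;
	uint32_t opcode;
	uint32_t len;
	uint32_t object_id;
};

/*
 * Hypothetical helper: copy the fixed header out of `buf` (which holds
 * `nread` bytes returned by one read) and check that `len` is consistent.
 * On success, *payload points at the type-specific data and *payload_len
 * is its size. Returns 0 on success, -1 if the buffer is malformed.
 */
int parse_msg_header(const unsigned char *buf, size_t nread,
		     struct msg_header *hdr,
		     const unsigned char **payload, size_t *payload_len)
{
	if (nread < sizeof(*hdr))
		return -1;              /* too short to hold a header */

	memcpy(hdr, buf, sizeof(*hdr)); /* avoids unaligned access */

	/* len covers the header plus the payload that follows it. */
	if (hdr->len < sizeof(*hdr) || hdr->len > nread)
		return -1;

	*payload = buf + sizeof(*hdr);
	*payload_len = hdr->len - sizeof(*hdr);
	return 0;
}
```
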
545+
546+ Turning on On-demand Mode
547+ -------------------------
548+
549+ An optional parameter becomes available to the "bind" command::
550+
551+ bind [ondemand]
552+
553+ When the "bind" command is given no argument, it defaults to the original mode.
554+ When it is given the "ondemand" argument, i.e. "bind ondemand", on-demand read
555+ mode will be enabled.
556+
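A minimal sketch of issuing the command from a daemon follows. It assumes the devnode is already open and that any other configuration commands the daemon needs have been sent beforehand; ``send_bind`` is a hypothetical helper name, and sending the command with a single write() is an assumption of this sketch.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue the "bind" command on the devnode fd.
 * With `ondemand` non-zero the optional "ondemand" argument is appended.
 * Returns 0 if the whole command was written, -1 otherwise.
 */
int send_bind(int devfd, int ondemand)
{
	const char *cmd = ondemand ? "bind ondemand" : "bind";
	size_t n = strlen(cmd);

	/* Assumption: the command is sent with one write() call. */
	return write(devfd, cmd, n) == (ssize_t)n ? 0 : -1;
}
```
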
557+
558+ The OPEN Request
559+ ----------------
560+
561+ When the netfs opens a cache file for the first time, a request with the
562+ CACHEFILES_OP_OPEN opcode, a.k.a. an OPEN request, will be sent to the user
563+ daemon. The payload is of the form::
564+
565+ struct cachefiles_open {
566+ __u32 volume_key_size;
567+ __u32 cookie_key_size;
568+ __u32 fd;
569+ __u32 flags;
570+ __u8 data[];
571+ };
572+
573+ where:
574+
575+ * ``data`` contains the volume_key followed directly by the cookie_key.
576+ The volume key is a NUL-terminated string; the cookie key is binary
577+ data.
578+
579+ * ``volume_key_size`` indicates the size of the volume key in bytes.
580+
581+ * ``cookie_key_size`` indicates the size of the cookie key in bytes.
582+
583+ * ``fd`` indicates an anonymous fd referring to the cache file, through
584+ which the user daemon can perform write/llseek file operations on the
585+ cache file.
586+
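The layout of the OPEN payload can be illustrated with a small parser. This is a sketch under stated assumptions: ``open_fixed`` mirrors the fixed fields above, ``open_view`` and ``parse_open_payload`` are hypothetical names, and ``volume_key_size`` is assumed to include the terminating NUL of the volume key.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the fixed part of the OPEN payload shown above. */
struct open_fixed {
	uint32_t volume_key_size;
	uint32_t cookie_key_size;
	uint32_t fd;
	uint32_t flags;
};

/* Hypothetical parsed view; the key pointers alias the input buffer. */
struct open_view {
	struct open_fixed fixed;
	const char *volume_key;          /* NUL-terminated string */
	const unsigned char *cookie_key; /* binary, cookie_key_size bytes */
};

/*
 * Split an OPEN payload of `n` bytes into its two keys.
 * Returns 0 on success, -1 if the sizes do not fit or the volume key
 * is not NUL-terminated (assumes the size counts the NUL).
 */
int parse_open_payload(const unsigned char *payload, size_t n,
		       struct open_view *out)
{
	size_t vsz, csz;

	if (n < sizeof(out->fixed))
		return -1;
	memcpy(&out->fixed, payload, sizeof(out->fixed));

	vsz = out->fixed.volume_key_size;
	csz = out->fixed.cookie_key_size;
	n -= sizeof(out->fixed);

	/* Both keys must fit in what remains after the fixed fields. */
	if (vsz == 0 || vsz > n || csz > n - vsz)
		return -1;

	out->volume_key = (const char *)payload + sizeof(out->fixed);
	if (out->volume_key[vsz - 1] != '\0')
		return -1;

	/* The cookie key follows the volume key directly. */
	out->cookie_key = payload + sizeof(out->fixed) + vsz;
	return 0;
}
```
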
587+
588+ The user daemon can use the given (volume_key, cookie_key) pair to distinguish
589+ the requested cache file. With the given anonymous fd, the user daemon can
590+ fetch the data and write it to the cache file in the background, even when
591+ the kernel has not triggered a cache miss yet.
592+
593+ Note that each cache file has a unique object_id, while it may have multiple
594+ anonymous fds. The user daemon may duplicate anonymous fds from the initial
595+ anonymous fd indicated by the @fd field through dup(). Thus each object_id can
596+ be mapped to multiple anonymous fds, and the user daemon itself needs to
597+ maintain the mapping.
598+
599+ When implementing a user daemon, please be careful of RLIMIT_NOFILE,
600+ ``/proc/sys/fs/nr_open`` and ``/proc/sys/fs/file-max``. Typically these needn't
601+ be huge since they're related to the number of open device blobs rather than
602+ open files of each individual filesystem.
603+
604+ The user daemon should reply to the OPEN request by issuing a "copen" (complete
605+ open) command on the devnode::
606+
607+ copen <msg_id>,<cache_size>
608+
609+ where:
610+
611+ * ``msg_id`` must match the msg_id field of the OPEN request.
612+
613+ * When >= 0, ``cache_size`` indicates the size of the cache file;
614+ when < 0, ``cache_size`` indicates any error code encountered by the
615+ user daemon.
616+
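The "copen" reply is a plain text command, so building it is a matter of string formatting. The sketch below is illustrative only; ``format_copen`` is a hypothetical helper, and the decimal formatting of both fields is an assumption of this example.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a "copen" reply into `buf`.
 * `cache_size` is the cache file size when >= 0, or a negative error
 * code when the daemon failed to handle the OPEN request.
 * Returns the command length, or -1 if `buf` is too small.
 */
int format_copen(char *buf, size_t cap, uint32_t msg_id, long long cache_size)
{
	int n = snprintf(buf, cap, "copen %u,%lld",
			 (unsigned int)msg_id, cache_size);

	return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The resulting string would then be written to the devnode to complete the OPEN request.
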
617+
618+ The CLOSE Request
619+ -----------------
620+
621+ When a cookie is withdrawn, a CLOSE request (opcode CACHEFILES_OP_CLOSE) will be
622+ sent to the user daemon. This tells the user daemon to close all anonymous fds
623+ associated with the given object_id. The CLOSE request has no extra payload,
624+ and should not be replied to.
625+
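Since the daemon itself has to remember which anonymous fds belong to which object_id, handling a CLOSE request amounts to a lookup in that mapping. The following is one possible sketch using a fixed-size table; ``track_fd``, ``handle_close`` and the capacity ``MAX_TRACKED`` are hypothetical choices for illustration, and a real daemon may prefer a hash table.

```c
#include <stdint.h>
#include <unistd.h>

#define MAX_TRACKED 64  /* arbitrary capacity for this sketch */

/* One entry per anonymous fd; several entries may share an object_id. */
struct fd_entry {
	uint32_t object_id;
	int fd;
	int used;
};

static struct fd_entry table[MAX_TRACKED];

/* Record that `fd` belongs to `object_id`. Returns 0, or -1 if full. */
int track_fd(uint32_t object_id, int fd)
{
	for (int i = 0; i < MAX_TRACKED; i++) {
		if (!table[i].used) {
			table[i] = (struct fd_entry){ object_id, fd, 1 };
			return 0;
		}
	}
	return -1;
}

/*
 * Handle a CLOSE request: close every fd tracked for `object_id`.
 * Returns the number of fds closed. No reply is sent to the kernel.
 */
int handle_close(uint32_t object_id)
{
	int closed = 0;

	for (int i = 0; i < MAX_TRACKED; i++) {
		if (table[i].used && table[i].object_id == object_id) {
			close(table[i].fd);
			table[i].used = 0;
			closed++;
		}
	}
	return closed;
}
```

The daemon would call ``track_fd`` for the fd received in the OPEN request and again for every duplicate it creates with dup().
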
626+
627+ The READ Request
628+ ----------------
629+
630+ When a cache miss is encountered in on-demand read mode, CacheFiles will send a
631+ READ request (opcode CACHEFILES_OP_READ) to the user daemon. This tells the user
632+ daemon to fetch the contents of the requested file range. The payload is of the
633+ form::
634+
635+ struct cachefiles_read {
636+ __u64 off;
637+ __u64 len;
638+ };
639+
640+ where:
641+
642+ * ``off`` indicates the starting offset of the requested file range.
643+
644+ * ``len`` indicates the length of the requested file range.
645+
646+
647+ When it receives a READ request, the user daemon should fetch the requested data
648+ and write it to the cache file identified by object_id.
649+
650+ When it has finished processing the READ request, the user daemon should reply
651+ by using the CACHEFILES_IOC_READ_COMPLETE ioctl on one of the anonymous fds
652+ associated with the object_id given in the READ request. The ioctl is of the
653+ form::
654+
655+ ioctl(fd, CACHEFILES_IOC_READ_COMPLETE, msg_id);
656+
657+ where:
658+
659+ * ``fd`` is one of the anonymous fds associated with the object_id
660+ given.
661+
662+ * ``msg_id`` must match the msg_id field of the READ request.
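
Putting the two steps together, a READ request could be served roughly as sketched below. This is only an illustration: ``fill_range`` and ``complete_read`` are hypothetical helper names, the fallback definition of the ioctl number is assumed to match ``<linux/cachefiles.h>``, and a real daemon may have to respect additional constraints on the writes that are not covered here.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed to match the definition in <linux/cachefiles.h>. */
#ifndef CACHEFILES_IOC_READ_COMPLETE
#define CACHEFILES_IOC_READ_COMPLETE _IOW(0x98, 1, int)
#endif

/*
 * Write `len` bytes from `data` into the cache file at offset `off`,
 * using only llseek and write on the anonymous fd.
 * Returns 0 on success, -1 on error.
 */
int fill_range(int anon_fd, uint64_t off, const void *data, uint64_t len)
{
	const unsigned char *p = data;

	if (lseek(anon_fd, (off_t)off, SEEK_SET) == (off_t)-1)
		return -1;

	while (len > 0) {
		ssize_t n = write(anon_fd, p, (size_t)len);

		if (n <= 0)
			return -1;      /* write error or no progress */
		p += n;
		len -= (uint64_t)n;
	}
	return 0;
}

/* Reply to the READ request identified by `msg_id`. */
int complete_read(int anon_fd, uint32_t msg_id)
{
	return ioctl(anon_fd, CACHEFILES_IOC_READ_COMPLETE,
		     (unsigned long)msg_id);
}
```

The daemon would fetch the range ``[off, off + len)`` from its backing source, pass it to ``fill_range``, and call ``complete_read`` only after the write has succeeded.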