List Info

Thread: Implement nanosecond timestamps (for version-based recovery)




Implement nanosecond timestamps (for version-based recovery)
user name
2007-01-26 02:14:24
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10651 What |Removed |Added ---------------------------------------------------------------------------- Attachment #9401|review?(adilgerclusterfs.co|review- Flag|m) | (From update of attachment 9401) >Index: linux-2.6.12/fs/ext3/ialloc.c >=================================================================== >--- linux-2.6.12.orig/fs/ext3/ialloc.c >+++ linux-2.6.12/fs/ext3/ialloc.c > -43,7 +43,6 > * the free blocks count in the block. > */ > >- > /* > * Read the inode allocation bitmap for a given block_group, reading > * into the specified slot in the superblock's bitmap cache. Please don't change whitespace in the bug unless needed. > -419,6 +418,8 static int find_group_other(struct super > * For other inodes, search forward from the parent directory's block > * group to find a free inode. > */ >+# define MAX(a, b) (((a) > (b))? (a) : (b)) The kernel already has a "max()" macro. > -591,9 +593,17 got: >+ if (EXT3_HAS_RO_COMPAT_FEATURE(sb, EXT3_FEATURE_RO_COMPAT_EXTRA_ISIZE)) { >+ ei->i_extra_isize = MAX(sb->s_min_extra_isize, >+ MAX(sb->s_want_extra_isize, >+ sizeof(struct ext3_inode) - >+ EXT3_GOOD_OLD_INODE_SIZE)); >+ } >+ else >+ ei->i_extra_isize = >+ (EXT3_INODE_SIZE(inode->i_sb) > EXT3_GOOD_OLD_INODE_SIZE) ? >+ sizeof(struct ext3_inode) - EXT3_GOOD_OLD_INODE_SIZE : 0; Hmm, it seems concievable (though it would be incorrect) to have RO_COMPAT_EXTRA_ISIZE set on a filesystem with inodes of EXT3_GOOD_OLD_INODE_SIZE, and in this case the above would corrupt the filesystem. How about in ext3_fill_super(): /* determine the minimum size of new large inodes, if present */ if (EXT3_INODE_SIZE(inode->i_sb) > EXT3_GOOD_OLD_INODE_SIZE) { EXT3_SB(sb)->s_want_extra_isize = sizeof(struct ext3_inode) - EXT3_GOOD_OLD_INODE_SIZE; if (EXT3_HAS_RO_COMPAT_FEATURE(sb, EXT3_FEATURE_RO_COMPAT_EXTRA_ISIZE)) if (EXT3_SB(sb)->s_want_extra_isize < le32_to_cpu(EXT3_SB(sb)->s_es->s_want_extra_isize)) le32_to_cpu(EXT3_SB(sb)->s_es->s_want_extra_isize); if (EXT3_SB(sb)->s_want_extra_isize < le32_to_cpu(EXT3_SB(sb)->s_es->s_min_extra_isize)) le32_to_cpu(EXT3_SB(sb)->s_es->s_min_extra_isize); } and here in ext3_new_inode() we only need: ei->i_extra_isize = EXT3_SB(sb)->s_want_extra_isize; We _might_ want to change this so that it will take s_want_extra_isize from the on-disk superblock each time, so that if tune2fs is used while the filesystem is in use then it will pick up this new value immediately. We might also want to verify that s_{want,min}_extra_isize is small enough to fit in EXT3_INODE_SIZE() or it may cause data corruption. > -2450,10 +2450,11 void ext3_read_inode(struct inode * inod >+ EXT3_INODE_GET_XTIME(i_ctime, i_ctime_extra, inode, raw_inode); >+ EXT3_INODE_GET_XTIME(i_mtime, i_mtime_extra, inode, raw_inode); >+ EXT3_INODE_GET_XTIME(i_atime, i_atime_extra, inode, raw_inode); >+ EXT3_INODE_GET_XTIME(i_crtime, i_crtime_extra, inode, raw_inode); Tabs instead of spaces in kernel patches, but more on this line below... > -246,7 +246,7 struct ext3_inode { > __le16 i_uid; /* Low 16 bits of Owner Uid */ > __le32 i_size; /* Size in bytes */ > __le32 i_atime; /* Access time */ >- __le32 i_ctime; /* Creation time */ >+ __le32 i_ctime; /* Change time */ Actually, looking in the current e2fsprogs-1.39-cfs2, the comment reads "Inode Change time". > -295,10 +295,52 struct ext3_inode { >+ __le32 i_crtime; /* Creation time */ >+ __le32 i_crtime_extra; /* extra Creation time (nsec << 2 | epoch) */ The comments here read "File Creation time" and "extra File Creation time". >+#define EXT3_FEATURE_RO_COMPAT_NS_TIMESTAMP 0x0008 >+#define EXT3_FEATURE_RO_COMPAT_EXTRA_ISIZE 0x0010 The RO_COMPAT_NS_TIMESTAMP feature was replaced with RO_COMPAT_EXTRA_ISIZE, and the latter has the value 0x0040. >--- linux-2.6.12.orig/include/linux/fs.h >+++ linux-2.6.12/include/linux/fs.h > -440,6 +440,7 struct inode { >+ struct timespec i_crtime; This definitely shouldn't be part of the generic VFS inode. Instead, it should be initialized in ext3_new_inode() and saved in the ext3_inode_info like i_dtime is, read from disk in ext3_read_inode(), and then written to disk in ext3_do_update_inode(). > -804,6 +805,8 struct super_block { > /* Granuality of c/m/atime in ns. > Cannot be worse than a second */ > u32 s_time_gran; >+ u16 s_min_extra_isize; /* All inodes have at least # bytes */ >+ u16 s_want_extra_isize; /* New inodes should reserve # bytes */ > }; Similarly, these fields are ext3-specific and should not go into the generic superblock. Instead they are in the ext3_super_block (which is read from disk) as can be seen in e2fsprogs-1.39-cfs2/lib/ext2fs/ext2_fs.h. That also means they need to be byte-swapped on access, but it likely makes sense to store an s_want_extra_isize in ext3_sb_info (the in-memory filesystem data) so we don't have to do the max() and byte-swap operations for each new inode. _______________________________________________ Lustre-devel mailing list Lustre-develclusterfs.com https://mail.clusterfs.com/mailman/listinfo/lustre-devel
[1]

about | contact  Other archives ( Real Estate discussion Medical topics )