IRCloggy #git 2006-08-30

2006-08-30

agile joined00:00
GyrosGeier joined00:04
kanru joined00:16
krh joined00:26
krh joined00:35
dwmw2_LHR → dwmw2_SFO00:38
xjjk joined00:58
kanru joined01:30
krh joined01:44
robfitz joined01:46
spearce Hmm. so compressed tree deltas apparently waste space in pack files...02:01
mugwump that sounds familiar :)02:04
spearce has it been discussed before on the list?02:06
mugwump 04:07 < spearce> i'm trying to cheat by not reloading the tree object from disk and instead reproduce it from what i have in memory about its entries/modes/sha1s.02:07
spearce oh, yea. that was because i would be holding two copies of its filename in memory. with many files that's a lot of data. but this compressed tree delta thing is totally unrelated.02:08
mugwump I guess it would make sense to customise the exact compression style per-object02:08
er, it would make sense that such an endeavour might have wins, I mean02:09
eg, you might want to partition delta compressing between different object types, and make the delta scanning algorithms apply tree-specific shortcuts to arrive at a better overall result02:10
spearce or just not compress tree deltas. :) but yes, and i think that's something i'm going to start working on...02:11
Jon and I are starting to become convinced that we can likely get good searching in files for real cheap at the same time.02:12
mugwump or tune the delta compression for trees such that they rarely get compressed...02:12
mugwump reads spearce's list post about mozilla .git02:14
somegeek joined02:14
mugwump oh, I'm confusing delta compression and zlib compression02:14
I didn't think zlib ever actually increased the size of stuff...02:15
I mean, once you've stripped the 20byte header02:15
spearce frell. I didn't account for the 20 byte header in my computation. Arrgh.02:17
mugwump still, that's only 20MB02:18
not enough for the 29MB expansion you found02:20
spearce The other 9 MB could be caused by the object headers.02:20
iirc the header can't exceed 5 bytes, but all odds are tree delta headers are around 3 bytes/each.02:21
I just corrected myself; it's more like 8.6 MiB being wasted.02:25
As the headers should be on average (2 bytes in length + 20 byte SHA1 base) * 976712 objects.02:26
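The per-object overhead spearce estimates can be checked with a few lines of arithmetic. This is only a sketch: the object count and byte sizes are the figures quoted above, and `overhead_mib` is a hypothetical helper name, not anything from git. It does not try to explain the 8.6 MiB figure, only the totals the stated per-object costs imply.

```python
# Figures quoted in the discussion; the per-object sizes are spearce's estimates.
OBJECTS = 976_712        # object count quoted in the discussion
HEADER_BYTES = 2         # estimated average object header length
BASE_SHA1_BYTES = 20     # SHA-1 of the delta base stored with each delta


def overhead_mib(objects: int, bytes_per_object: int) -> float:
    """Return the total overhead in MiB for a fixed per-object cost."""
    return objects * bytes_per_object / 2**20


if __name__ == "__main__":
    sha1_only = overhead_mib(OBJECTS, BASE_SHA1_BYTES)
    with_header = overhead_mib(OBJECTS, HEADER_BYTES + BASE_SHA1_BYTES)
    print(f"base SHA-1 only: {sha1_only:.2f} MiB")
    print(f"header + SHA-1:  {with_header:.2f} MiB")
```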
mugwump hmm, yapc::eu starts in 6 hours, better get some rest :)02:28
spearce 'night.02:28
kanru joined02:40
Tv joined03:31
james_d joined03:56
james_d left04:03
xjjk joined04:15
Gitzilla joined04:16
james_d joined04:28
james_d left04:47
dst_ joined06:05
somegeek joined06:09
somegeek joined07:00
ruskie joined07:05
ferdy joined07:05
somegeek joined07:28
ruskie joined07:28
somegeek joined08:05
somegeek joined08:18
thi joined08:31
thi hello, in lines 1580 to 1592 of gitweb.perl there must be a bug: the if condition is always true08:33
devogon joined08:39
polyonymous joined09:18
mchehab joined10:00
mchehab joined10:14
somegeek joined10:37
coywolf joined10:53
ilogger2 joined11:07
mchehab joined12:01
ruskie joined12:28
timlarson_ joined12:37
ferdy joined13:07
agile joined13:58
spearce joined14:08
segher_ joined14:11
benlau joined14:27
ferdy joined14:28
cworth joined14:38
Oejet joined14:52
xjjk joined15:00
polyonymous joined15:07
krh joined15:47
pdmef joined15:47
yashi joined15:56
GyrosGeier joined16:04
philips I am trying to use git-apply on this http://ifup.org/~philips/kernbench.diff but it is complaining: fatal: corrupt patch at line 1516:13
I tried using git-applymbox earlier today and it complained also16:13
git version 1.4.1.116:13
patch -p0 < /home/philips/kernbench.diff works just fine...16:14
Oejet I guess it is because it is not a "git patch", so you should use normal patch, like that.16:15
tnks left16:15
spearce Right. I don't think git apply likes a non-git patch.16:16
philips ah ok16:16
sucky16:16
spearce line 15 as it happens is the blank line between the two hunks.16:16
git patches never have blank lines between hunks, unless it's part of the context of the preceding hunk.16:17
philips On the mbox's I was trying to apply it was failing on the first newline16:17
no matter if it was between hunks or not16:18
Does anyone else have troubles with git-applymbox or git-apply?16:26
spearce they work fine for me with git generated patches; if it's not a git generated patch I use GNU patch. so no. :)16:26
philips It doesn't work for me in any case and the guys in #git are trying to convince me that the tools only work with patches generated by git16:26
err, haha, wrong channel16:26
:-D16:27
spearce git apply is incredibly picky about what it gets and refuses to perform a lot of things that patch would normally do just fine.16:28
philips ohhh right, because it is using internal libpatch or something.16:28
spearce actually I think Linus hand-wrote his own patch apply routines in git-apply.16:29
alley_cat joined16:39
anholt_ joined16:51
robin joined17:15
Tv joined17:52
timlarson_ joined18:27
timlarson_ joined18:28
DrNick joined18:30
beu → beu_18:53
beu joined18:56
gittus joined18:59
gittus philips: http://ifup.org/~philips/kernbench.diff is indeed corrupt, and git-apply is correct in complaining about it.19:00
That "empty line" on line 15 is very much corruption: according to the diff headers, it is still part of the patch,19:00
but it is not a valid diff-line (it lacks the initial space)19:01
So git is correct (as usual), and you have corrupted your patch either by editing it, copy-and-pasting it, or by having a totally broken "diff" binary.19:01
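The rule gittus describes (every line inside a hunk must begin with a space, `+`, `-`, or `\`) can be sketched as a small checker. This is an illustration under stated assumptions, not git-apply's actual code: `first_corrupt_line` is a hypothetical name, and only unified-diff hunk headers of the form `@@ -a,b +c,d @@` are handled.

```python
import re

# Hunk header such as "@@ -10,7 +10,8 @@"; a missing count means 1.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def first_corrupt_line(patch_text: str):
    """Return the 1-based number of the first invalid hunk line, or None."""
    old_left = new_left = 0  # lines the current hunk still owes us
    for number, line in enumerate(patch_text.split("\n"), start=1):
        if old_left <= 0 and new_left <= 0:
            # Outside a hunk: anything is allowed; look for the next header.
            match = HUNK_HEADER.match(line)
            if match:
                old_left = int(match.group(1) or 1)
                new_left = int(match.group(2) or 1)
            continue
        marker = line[:1]
        if marker == " ":        # context line: counts on both sides
            old_left -= 1
            new_left -= 1
        elif marker == "-":      # removed line: old side only
            old_left -= 1
        elif marker == "+":      # added line: new side only
            new_left -= 1
        elif marker == "\\":     # "\ No newline at end of file"
            continue
        else:
            # Includes a completely empty line, i.e. context that lost its space.
            return number
    return None
```

According to the header counts, an empty line in the middle of a hunk is still part of the hunk, so the checker reports it instead of treating it as a separator.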
Beber` joined19:02
gittus The fact is, git is perfectly happy to apply patches generated by other systems, but git refuses to apply patches that have technical problems.19:02
GNU patch (and probably some other patch applicator programs) applies any random crap it can.19:03
Anyway, just thought I'd set the record straight. "git-apply" _is_ very anal, but that's very much on purpose.19:04
dwmw2_gone joined19:04
gittus Unlike apparently a lot of other programs, git cares /deeply/ about data integrity at all levels.19:04
beu joined19:05
gittus "Guessing" what the correct thing should be is simply not acceptable.19:05
dwmw2_gone → dwmw219:08
Trigger7 joined19:12
spearce Yea but guessing is damn convenient sometimes. Like when the patch is corrupt. :)19:13
gittus spearce: sure. Guessing is convenient. But /not/ guessing means that you know that you can trust the end result.19:14
I'll take "trust it" over "convenience" any day when it comes to git.19:14
Besides, it has side effects too. In particular, GNU patch has allowed people to perpetuate corruptions, because things still "work"19:15
So you can have bad email clients or cut-and-paste crap that removes spaces at the end of lines, and nobody necessarily even notices.19:15
In contrast, when you're strict, and notice, maybe it's painful right then and there, but you can /fix/ the problem.19:15
spearce I don't disagree. But it is nice when you can still save the patch without editing it by applying it, cleaning up any rejects if any, and test the hell out of the result.19:15
gittus And that means that not only is git-apply then more trustworthy, so is your email client.19:16
spearce It's like if Git made it impossible to read a partially corrupt pack... you'd lose the entire pack if you couldn't read it because of a single bit error in one object.19:16
gittus but that's exactly what you _want_19:17
You may have a git-fsck-objects or something that tries to fix things up, or a "git-extract-as-much-as-possible", or a "--unsafe" flag or similar.19:17
But you should _refuse_ to touch anything that you notice is corrupt. Which is exactly what git does.19:17
In fact, the biggest problem with the recent bit corruption was not that git didn't refuse to touch it (it did, eventually),19:18
spearce I completely agree. But right now git-apply doesn't have a --unsafe flag. :)19:18
gittus but that it could have noticed much earlier.19:18
Actually, git-apply has /several/ "--unsafe" flags, but for _other_ kinds of corruption.19:18
See "-pNUM" and "--whitespace=strip" and friends.19:19
Now, if somebody wants to add a "--accept-crap" flag, please send the patch to Junio, but make sure it's not on by default.19:19
Personally, I want to _know_ about it when somebody sends me crap, so that I can try to make sure that it gets fixed.19:20
spearce Needing -pNUM isn't corruption, it's a user who didn't generate the patch correctly for how Git would prefer to apply it. --whitespace=* is looking for weird whitespace corruption but not always, e.g. if you are working with DOS formatted files that whitespace stuff doesn't like the CRs on the end of lines.19:20
gittus No, "-pNUM" _is_ about corruption. It means that git-apply will accept a patch even if it has "fuzz".19:21
That's strictly speaking a corrupt patch that doesn't apply any more.19:21
spearce ah, i thought it was the leading directory stripping thing like patch's -pNUM flag.19:21
gittus Oh, damn, you're right. The fuzz thing is -C<num>. Sorry, my bad.19:22
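The leading-directory stripping that spearce attributes to -pNUM can be sketched as below. `strip_components` is a hypothetical helper written for illustration; it assumes simple slash-separated relative paths and does not reproduce every detail of how patch or git-apply normalise paths.

```python
def strip_components(path: str, count: int) -> str:
    """Drop `count` leading slash-separated components from a patch path."""
    parts = path.split("/")
    if count < 0 or count >= len(parts):
        raise ValueError(f"cannot strip {count} components from {path!r}")
    return "/".join(parts[count:])


if __name__ == "__main__":
    # A git-style header path "a/drivers/net/foo.c" with one component stripped.
    print(strip_components("a/drivers/net/foo.c", 1))
```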
spearce Fuzz is corruption yes, assuming you found a reasonable ancestor to apply the damn patch to. But iirc git-am doesn't search as hard as it could for a possible base to apply that patch to, especially if it's not a Git generated patch which lacks blob sha1's. Fuzz may be necessary to get the damn patch to apply to whatever git-am thinks is the right base. In which case it's git-am inducing the corruption, not the patch itself... which is still corrupt19:23
Eludias joined19:25
GyrosGeier hm19:25
is there a tool these days that I can tell "apply this patch to the most recent thing you can find where it applies cleanly, and form a new branch from that?" (optionally specifying the branch it must be on)19:26
spearce git-checkout -b tmp; git-am patch ?19:27
git-am is the best guesser we have but it doesn't automatically form the branch and it doesn't guess as well as it could (of course more guessing would take longer, but it's already pretty darn fast, so...)19:28
gittus spearce: I think GG was talking about finding that most recent point automatically (ie walking backwards until it applies cleanly)19:28
GyrosGeier exactly19:28
spearce If you read git-am I think it tries that, but it only tries the 5 most recent labels or something...19:29
GyrosGeier I have a bunch of old Linux trees that contain changes I once made for some piece of hardware19:30
spearce Hmm, that code was removed from git-am. Sorry.19:30
GyrosGeier I think I could turn most of them into patches, and there are repos of old trees19:31
spearce Probably because it didn't work as well as it could have.19:31
gittus GyrosGeier: it would be pretty expensive with old-fashioned diffs to do anything but a small number of tests, but it might not be _too_ bad if you had a real git diff with the SHA1 names listed explicitly, and then did a script that basically did "git-ls-tree" on the commits (with the appropriate files listed) to find a tree that matches in all files19:31
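The SHA1-matching idea above can be sketched in two parts: pull the pre-image blob ids out of the `index` lines of a git-format patch, then compare them with `git ls-tree -r` output for a candidate commit. This is a rough sketch under assumptions, not an existing tool: the function names are hypothetical, paths containing spaces are not handled, and newly created files (whose pre-image id is all zeros) will never match.

```python
import re

DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$")
INDEX_LINE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)")


def preimage_blobs(patch_text: str) -> dict:
    """Map each path in a git-format patch to its abbreviated pre-image blob id."""
    blobs = {}
    path = None
    for line in patch_text.splitlines():
        header = DIFF_HEADER.match(line)
        if header:
            path = header.group(1)
            continue
        index = INDEX_LINE.match(line)
        if index and path is not None:
            blobs[path] = index.group(1)
            path = None  # only the first index line after a header counts
    return blobs


def parse_ls_tree(output: str) -> dict:
    """Parse `git ls-tree -r` lines: '<mode> <type> <sha1><TAB><path>'."""
    blobs = {}
    for line in output.splitlines():
        meta, _, path = line.partition("\t")
        fields = meta.split()
        if len(fields) == 3 and fields[1] == "blob":
            blobs[path] = fields[2]
    return blobs


def tree_matches(tree_blobs: dict, wanted: dict) -> bool:
    """True if every wanted path exists in the tree with a matching blob id prefix."""
    return all(
        tree_blobs.get(path, "").startswith(prefix)
        for path, prefix in wanted.items()
    )
```

A caller would feed `parse_ls_tree` the output of `git ls-tree -r <commit> -- <paths>` for each candidate commit and stop at the first one for which `tree_matches` is true.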
GyrosGeier gittus, well, I'd try one file, if it matched, the next one and so on.19:31
gittus, if a test fails, we can rule out that SHA1 for this patch19:32
*path19:32
gittus Sure, that would also work.19:32
It would probably be pretty useful even _without_ actually applying the patch itself, ie I can imagine that you'd want to ask just "what version does this patch apply cleanly to" without even applying it.19:33
GyrosGeier yep19:33
do we have a tool to branch and apply in one step, to complement that?19:33
gittus But it would be more useful with just regular patches too, and that's much more involved.19:33
GG: nope.19:33
GyrosGeier gittus, I'm talking about regular patches19:33
gittus With regular patches, you want to be smarter, but it's still possible to be efficient. What you'd do is:19:34
- try to apply it in the top-of-tree19:34
GyrosGeier gittus, a patch is a list of before/after chunks19:34
gittus - if it doesn't work, just do a "git-rev-list HEAD -- <list-of-all-files>" to get only the list of commits that actually _changed_ any of those files19:35
- then try each of those in turn.19:35
That should be reasonably efficient (ie you'd not have to do a _lot_ of unnecessary patch application, and you'd let git do the SHA1-based optimization for when files don't change)19:35
If you teach "git-apply" to take the source from a "git revision + pathname", you'd be able to test whether a patch applies or not without actually ever having to check everything out, too.19:37
A small matter of programming. "cat patch | git-apply --test <revision>" and then check the return value.19:37
Something like that.19:37
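The steps gittus lists (try the top of the tree, then only the commits that changed the affected files) can be sketched as follows. The `git-apply --test <revision>` option mentioned above is his proposal, not an existing flag, so this sketch swaps in a different technique: it loads each candidate into a temporary index with `git read-tree` and runs `git apply --cached --check` against it. The function names are hypothetical, and the code assumes it runs inside a git repository.

```python
import os
import subprocess
import tempfile


def first_applicable(candidates, applies):
    """Return the first candidate for which applies(candidate) is true, else None."""
    for commit in candidates:
        if applies(commit):
            return commit
    return None


def candidate_commits(paths, start="HEAD"):
    """The start commit first, then commits that changed any of `paths`, newest first."""
    head = subprocess.run(
        ["git", "rev-parse", start],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    changed = subprocess.run(
        ["git", "rev-list", start, "--", *paths],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    return [head] + [commit for commit in changed if commit != head]


def applies_to(commit, patch_file):
    """Check the patch against `commit` via a throwaway index, leaving the work tree alone."""
    with tempfile.TemporaryDirectory() as tmp:
        env = dict(os.environ, GIT_INDEX_FILE=os.path.join(tmp, "index"))
        subprocess.run(["git", "read-tree", commit], check=True, env=env)
        result = subprocess.run(
            ["git", "apply", "--cached", "--check", patch_file],
            env=env, capture_output=True,
        )
        return result.returncode == 0


def find_base(patch_file, paths):
    """Newest commit in the candidate list that the patch applies to cleanly, or None."""
    return first_applicable(
        candidate_commits(paths),
        lambda commit: applies_to(commit, patch_file),
    )
```

Reading a whole tree into an index for every candidate is the cost gittus warns about later in the discussion; the sketch accepts that cost in exchange for using only commands that exist today.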
spearce Doesn't git-apply actually work off the index right now, so you don't even need to have it checked out to the working directory, just loaded into a temporary index.19:37
GyrosGeier gittus, I'd use the "before" chunks to find candidate blobs for the first file, then look at the list of commits that contain one of these at this path, then repeat for the next file in the patch, starting on the latest commit where file 1 matched19:38
gittus Not really. "git-apply --index" _verifies_ and _updates_ the index, but it will still get the actual _data_ from the currently checked out file.19:38
spearce: plus, you'd not actually want to even read things into the index. "git-read-tree" is fairly expensive, because it (by definition) needs to recursively read _everything_. If you'd just read the data directly only for the files you want, that would be a lot more efficient.19:39
GyrosGeier gittus, that requires patch splitting code, but I'd expect it to be roughly O(n*log n) in the number of files touched by the patch19:39
gittus GG: sure. Whatever works.19:40
I'd suggest using git-apply to help you, though. It already does almost all of the work, the only thing you'd need to extend it with is that "try to apply to a specific tree" thing.19:41
Once you have that, git-apply would just tell you exactly which files do _not_ match, and you can use any heuristic you want to find the next commit.19:41
However, I really do believe that you'd be better off using _all_ pathnames (from _all_ current heads), because if you use one path at a time, and it _does_ apply in that path, what will you then use as the set of starting points for the other paths (that _don't_ apply)?19:42
Using just the last "this worked for file<n-1>" commit would be wrong, because that might be in a branch off the mainline (or off the series that the patch _ever_ applies to).19:43
GyrosGeier gittus, of course19:43
gittus See? You need to follow the _full_ history here, for _all_ paths.19:43
robin joined19:44
gittus But git really does have fairly good support for this. "git-rev-list --all -- <set-of-filenames>" really should give you exactly what you want (except you do want to check the current head first).19:44
dwmw2 → dwmw2_lunch19:44
GyrosGeier gittus, essentially I have a list of blobs where patch 1 applies (I don't need to realize the list in memory), for each I have a list of commits where it is used at this path19:45
gittus, then I can do a depth search19:45
gittus, optionally reordering the patch files to get smaller sets19:46
gittus GG: I'm sure you can make it work somehow. But you need to then use the _full_ list of commits, because some commits will change fileA but not fileB, and when you do the final "which commits support _all_ those conditions" decision, you'll otherwise be in trouble.19:48
So I'm still claiming that you're better off using "git-rev-list -- <filelist>" to get the list of interesting commits _first_. And once you do that, you might as well just use that list to test (one commit at a time)19:48
But hey, the proof is in the pudding. I'm not actually going to write this, and if you will, you can do it any which way you want19:49
"He who writes the code gets the final say"19:49
Anyway, I'm off. I only piped up because I happened to look through the logs again.19:49
Have fun19:49
ruskie joined19:53
somegeek joined20:01
kanru joined20:20
Gitzilla left20:21
Gitzilla joined20:21
anholt_ joined20:37
dwmw2_lunch → dwmw221:03
somegeek joined21:55
mchehab joined22:12
mchehab left22:14
Oejet joined22:21
krh joined22:34
Oejet left22:57
agile joined22:58
boto joined23:15