Git branch in your bash prompt
Kevin Barnes has posted his mechanism for getting the current branch of the git repository into his bash prompt.
He mentions color in his article, and it turns out I'm the person who added the color, so I figured I'd post my version.
Here are the functions I have:
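# Print the name of the current branch (empty outside a git repository).
git_current_branch()
{
    git branch 2>/dev/null | sed -n '/^\*/ s/^\* //p'
}

# Print the branch name, prefixed with its color code and followed by a space.
git_display()
{
    br=$(git_current_branch)
    if [ -n "$br" ]; then
        echo "$br" | BRANCH="$br" GIT_COLOR=$(git_color) awk '{if ($1) { print ENVIRON["GIT_COLOR"] ENVIRON["BRANCH"] " " } }'
    fi
}

# Pick orange if git status reports changes, pink otherwise.
# ORANGE and PINK must be exported so awk can see them in ENVIRON.
git_color()
{
    git status 2>/dev/null | grep -c : | awk '{if ($1 > 0) { print ENVIRON["ORANGE"] } else { print ENVIRON["PINK"] } }'
}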
And then here's my prompt:
title="\033]0;\h:\W\007"
PS1="$title[\$(git_display)$GREEN\w$NOCOLOR]\n\u@\h("'$?'") $ "
First, the git bits. As Kevin mentions, I color the branch name; I use orange if I've got modified files, and pink if I don't (these are names that I map elsewhere to terminal codes). The three functions provide the three, um, functions: See what branch I'm on, see whether there are uncommitted files, and colorize the branch name.
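The color names themselves aren't shown above, so here is a minimal sketch of how they might be defined. The variable names come from the prompt and functions, but the specific escape codes are assumptions (they presume a 256-color terminal) and not necessarily what I actually use:

```shell
# Hypothetical color definitions: the variable names come from the prompt
# above, but the specific escape codes are assumptions (256-color terminal).
ORANGE=$(printf '\033[38;5;208m')
PINK=$(printf '\033[38;5;205m')
GREEN=$(printf '\033[32m')
NOCOLOR=$(printf '\033[0m')

# git_color reads ORANGE and PINK through awk's ENVIRON array, so those two
# have to be exported; GREEN and NOCOLOR are only expanded by the shell.
export ORANGE PINK
```

If the colors make bash miscount the prompt width, the bash manual describes \[ and \] for bracketing non-printing sequences in PS1; whether that is needed depends on your bash version and terminal.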
Now, the bash bits.
First, you'll notice I have a multi-line prompt. I first started this when I switched to a color prompt, because for a while there bash didn't like the hidden characters that add color. I got to choose between a multi-line prompt, or a prompt that wrapped in broken ways. Since I really only wanted color in the path, I put that on the upper line, avoiding most of the wrapping problems. It was worth it, because (especially when I was still a sysadmin by trade) having color, almost any color, in the prompt makes it easy to pick out my commands from command output.
Now that I've had my prompt this way for about 4 years, I'm pretty fond of it. I think the bash wrapping problems are mostly fixed now, but I'm keepin git.
The $title sets the terminal title (which is slightly useful).
So, that's how I add the git branch, colored by whether I have uncommitted files, to my prompt, with a bit more prompt info thrown in for kicks.
Mon, 05 May 2008 | Tags: tools, git, bash
ralsh is Awesome
So, I'm testing ticket #1099, and I run this snippet of code:
user { testing: ensure => present, home => "/var/tmp" }
Then this one:
user { testing: ensure => present, home => "/tmp" }
Sure enough, the home directory changes (although it doesn't actually move the directory, thankfully), so I clearly didn't do due diligence on accepting the bug information from my client. Now I need to remove the user. Sure, I can modify and reexecute the file, but why should I, when I can just do this:
luke@culain(0) $ sudo puppet/bin/ralsh user testing ensure=absent
notice: /User[testing]/ensure: removed
user { 'testing':
    uid => 'absent',
    home => 'absent',
    password => 'absent',
    gid => 'absent',
    groups => 'absent',
    comment => 'absent',
    ensure => 'absent',
    shell => 'absent'
}
luke@culain(0) $
The extra output of the user is kinda silly, and really only matters when printing rather than modifying users, but still, I use this all the time, and I'm quite fond of it and its silly name.
Tue, 26 Feb 2008 | Tags: tools, ralsh, puppet
Closing the first DTrace loop
I ended up spending what time I had for a couple of days at LCA optimizing Puppet's lexer. As I mentioned in my first DTrace post, I'm mostly just trying to use existing scripts for now and worry about understanding the darn thing later.
It turns out that the Swiss Army Knife of available scripts for Ruby is the rb_calltime.d script in the DTraceToolkit. This one gives us pretty much everything we might want to know in a first pass of debugging a Ruby program, including (most importantly for me) the inclusive and exclusive elapsed times for each method in the system (DTrace insists upon calling them functions, but it's Ruby, so we know better).
Unfortunately, I'd already hacked out most of the optimization before I discovered this script, and it doesn't seem to want to run at the moment, so all I can do is show the current data. Here are the methods that take the most time inclusively (meaning that it counts the time from entry to exit, and thus just tells us where the time is being spent in the overall program, not where the actual problems might lie):
lexer.rb           func  Puppet::Parser::Lexer::munge_token         12062622
lexer.rb           func  Puppet::Parser::Lexer::TokenList::lookup   12410291
branch.rb          func  Puppet::Parser::AST::Branch::initialize    19084001
methodhelper.rb    func  Hash::each                                 19144097
methodhelper.rb    func  Object::set_options                        20712067
ast.rb             func  Puppet::Parser::AST::initialize            22431107
lexer.rb           func  Array::each                                23868735
parser_support.rb  func  Class::new                                 28520157
lexer.rb           func  Puppet::Parser::Lexer::find_string_token   29610463
lexer.rb           func  Puppet::Parser::Lexer::find_regex_token    29844180
parser_support.rb  func  Puppet::Parser::Parser::ast                36132924
lexer.rb           func  Puppet::Parser::Lexer::find_token          44581509
lexer.rb           func  Object::catch                              46187059
And here are the exclusive times (i.e., only counting the time spent in each method, not the time between entry and exit):
ast.rb             func  Puppet::Parser::AST::initialize             1719039
lexer.rb           func  Puppet::Parser::Lexer::Token::convert       1920189
branch.rb          func  Puppet::Parser::AST::Branch::initialize     2062436
lexer.rb           func  Object::catch                               2262789
lexer.rb           func  StringScanner::match?                       2471109
parser_support.rb  func  Class::new                                  3112051
lexer.rb           func  Puppet::Parser::Lexer::skip                 3282333
lexer.rb           func  Puppet::Parser::Lexer::find_token           4930508
lexer.rb           func  Puppet::Parser::Lexer::munge_token          5232851
lexer.rb           func  Puppet::Parser::Lexer::find_regex_token     5572052
lexer.rb           func  Puppet::Parser::Lexer::TokenList::lookup    6659161
parser_support.rb  func  Puppet::Parser::Parser::ast                 7144763
lexer.rb           func  Hash::[]                                    7179650
methodhelper.rb    func  Hash::each                                 14954253
lexer.rb           func  Puppet::Parser::Lexer::find_string_token   17045065
lexer.rb           func  Array::each                                20768233
-                  total -                                         132806477
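The share of the exclusive total that any one method accounts for can be read straight off this listing. As a rough sketch (the variable names are mine, and the two figures are simply copied from the find_string_token and total rows), the arithmetic looks like this:

```shell
# Figures copied from the exclusive-time listing above.
find_string_token=17045065
total=132806477

# awk does the floating-point division that plain shell arithmetic cannot.
share=$(awk -v part="$find_string_token" -v total="$total" \
    'BEGIN { printf "%.1f", 100 * part / total }')
echo "find_string_token: ${share}% of exclusive time"
```

Swapping in another row's figure gives that method's share the same way.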
Given this data, we're spending about an eighth of the total parsing time (17045065 of 132806477) just in the find_string_token method, which currently looks like this:
def find_string_token
  matched_token = value = nil
  # We know our longest string token is three chars, so try each size in turn
  # until we either match or run out of chars. This way our worst-case is three
  # tries, where it is otherwise the number of string chars we have. Also,
  # the lookups are optimized hash lookups, instead of regex scans.
  [3, 2, 1].each do |i|
    str = @scanner.peek(i)
    if matched_token = TOKENS.lookup(str)
      value = @scanner.scan(matched_token.regex)
      break
    end
  end
  return matched_token, value
end
The method is responsible for determining if the next token is a simple string-based token. I've optimized it by taking advantage of the fact that the longest string-based tokens are three characters (the <<| and |>> tokens), so I look for three-character matches, then two, then one. If I don't get a match by then, we don't have a match. I could probably optimize further, since these three-character tokens are pretty darn rare, especially compared to the one-character tokens, but I'd need to hard-code a lot more knowledge about the token list, and really, this iteration should be delimited by an automatic determination of the longest token, rather than hard-coding it.
This really isn't a very good write-up of what I did with DTrace or how it was helpful, other than showing the interesting differences between the exclusive and inclusive data, and letting you know that the rb_calltime.d script is the one to start with, but hey, that's more than I could find when I started looking, so hopefully this will get you somewhere.
I expect to continue spending more time using DTrace for optimizations, and I'll hopefully start uploading my data so I don't have to worry about taking these snapshots. Graphs would sure be nice....
Tue, 05 Feb 2008 | Tags: tools, dtrace, puppet, performance, optimization, profiling, profiler
A bit more DTrace
(This should have been posted a while ago, but I guess I had a problem and it's been sitting uncommitted for a while.)
After pulling apart the skip method in the lexer, so that the various parts are in separate methods, I get this as my count:
Puppet::Parser::Lexer             munge_token          56778       358   20335592
Class                             new                  28242       889   25132822
Puppet::Parser::Parser            ast                  25881      1147   29695496
Fixnum                            <                  1817071        16   30723097
StringScanner                     check              1829886        26   48732560
String                            length             3757782        20   78611361
Puppet::Parser::Lexer::TokenList  each                 56778      6618  375813485
Puppet::Parser::Lexer             find_token           56778      6714  381227038
Hash                              each                 84949      4563  387630769
Puppet::Parser::Parser            import                   9  45754308  411788774
Puppet::Parser::Parser            _reduce_132              9  45755009  411795083
Object                            catch                56018      8086  452970031
Puppet::Parser::Lexer             scan                   173   2751816  476064309
Racc::Parser                      _racc_yyparse_c        173   2751907  476080064
Object                            __send__               173   2751984  476093248
Racc::Parser                      yyparse                173   2752322  476151712
Puppet::Parser::Parser            parse                  173   2752742  476224530
Array                             collect                331   1446548  478807659
Array                             each                 26303     18476  485983221
The interesting one there is the Lexer.find_token method -- I just created that, and it looks like it's taking 38/48 of the total parse time, which is a helluva lot.
This method is responsible for picking the token to return, and the complicated aspect of the method is that it has to return the longest match, which is currently done by matching each token in turn (skipping those that don't match), and picking the longest match. This is expensive, because it means that every token is iterated over for every returned token, which means it scales at O(N^2), which is bad.
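To make the "try every token, keep the longest" idea concrete, here is a small shell sketch. It is only an illustration of the technique, not Puppet's lexer (which is Ruby and works with regexes): the longest_match function and its token list are hypothetical, with only <<| and |>> taken from the posts here.

```shell
# Illustrative only: a hypothetical token list and matcher, not Puppet's code.
# Every candidate token is tested against the start of the input, and the
# longest one that matches wins, so each call walks the whole token list.
longest_match() {
    input=$1
    best=""
    for tok in '<<|' '|>>' '<<' '=>' '<' '=' '|'; do
        case "$input" in
            "$tok"*)
                # Keep this token only if it beats the best match so far.
                if [ "${#tok}" -gt "${#best}" ]; then
                    best=$tok
                fi
                ;;
        esac
    done
    printf '%s\n' "$best"
}

longest_match '<<| File'
longest_match '=> present'
```

The first call prints <<| even though << and < also match the start of the input, which is the longest-match rule in action; the cost is that no candidate can be skipped early.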
Mon, 28 Jan 2008 | Tags: tools, dtrace, ruby, programming
A first pass at DTrace
I've never really spent much time optimizing Puppet except in those areas that get particular complaints (and not always then), but now that I'm forced to run Leopard I figured I should see if I can put DTrace to use.
The first pass used the functime.d script, which tells me how long Puppet spends in each function. I couldn't get the file to execute directly, and I also couldn't get it to execute my script for me (which is a pretty good indication that I don't really know how to use DTrace), so I added the ability to pause my test script, giving me time to start dtrace. So, I run my test script, which I'm using to test parse time:
~/puppet/ext/puppet-test --modulepath /Users/luke/Desktop/puppet-stanford/modules/ -s parser -t parse --manifest ~/Desktop/puppet-stanford/master/manifests/site.pp -p
Then I run the dtrace script:
sudo dtrace -s ./functime.d -p 45847 2>&1 | tee functimes.log
This takes a heckuva long time to run (380 seconds or so, vs. about 6 normally), but in the end I get a big file that has histograms for all of the classes and methods, along with a sorted list of how long Puppet spends in each method. E.g., here's a histogram:
Puppet::Parser::Parser parse
           value  ------------- Distribution ------------- count
         8388608 |                                         0
        16777216 |                                         1
        33554432 |                                         0
        67108864 |                                         1
       134217728 |@@@@@@@                                  30
       268435456 |@@@@@@@@@                                39
       536870912 |@@@@@@@@@@@                              49
      1073741824 |@@@@@                                    21
      2147483648 |@@@                                      14
      4294967296 |@@                                       10
      8589934592 |@                                        6
     17179869184 |                                         2
     34359738368 |                                         0
And here's a few of the methods:
Puppet::Parser::Parser  ast                  25881      1090   28219110
NilClass                nil?               1982238        19   38713614
StringScanner           check              2044008        24   50380323
Hash                    each                 84949      2789  236945697
Puppet::Parser::Parser  import                   9  29288385  263595467
Puppet::Parser::Parser  _reduce_132              9  29289048  263601440
Object                  catch                56018      5403  302711262
Puppet::Parser::Lexer   scan                   173   1752645  303207689
Racc::Parser            _racc_yyparse_c        173   1752730  303222439
Object                  __send__               173   1752803  303234951
Racc::Parser            yyparse                173   1753138  303292971
Puppet::Parser::Parser  parse                  173   1753536  303361785
Array                   collect                331    925551  306357434
Array                   each                 26303     11912  313340970
The first annoying thing to notice about this is that this test is clearly collecting total time between method entry and exit, not the total time that we're in a method, which makes it a bit less useful for testing.
The next thing to notice is that we're calling nil? and check a ton of times, which adds up even though they're individually very cheap.
If we add up all of the calls to check and nil?, we get roughly 30% of the total run time of the parse method (which is the entry point to all of this code), which means they're having a big impact.
This really isn't anything I couldn't get from normal Ruby profiling, but based on my experience working with Brendan a bit at OSCON last year, I know there's much more available.
My next post on DTrace will hopefully include me covering how I used it to drill down a bit further.
Mon, 28 Jan 2008 | Tags: tools, dtrace, development
A Better Signature Generator
I just discovered Signature Profiler, a great plugin for creating signatures in Mail.app (it works on Leopard and Tiger). Finally, I can get rid of the painfully hackish Python (!!) plugin I was maintaining, which is good since it apparently didn't work in Leopard anyway. I never could figure out how to make it provide the signatures without a leading space on each line, which was pretty annoying.
This plugin provides plenty of nice options for managing signatures, but the main thing I wanted was to be able to include the output from my long-standing signature generation script (which largely just pulls a random file from a directory of the quotes I've collected over the years).
I also took the opportunity to trim my signature list; some of the quotes were funny in 1997 but not so much now.
Sat, 05 Jan 2008 | Tags: tools, software, email, signature, osx, macosx
Git, one month on
I've been using Git for about a month now. Overall, everyone has been right about it -- it's got some heinous usability problems, but man is it kick ass to have distributed version control.
For instance, I've taken a few trips since I switched to Git, and I've committed on an airplane at least twice now. This seems like a small thing, in that I could always wait to commit, but I'm often surprisingly productive in planes, and there are plenty of things you can't actually recover from in SVN without the full repository (e.g., moving directories around).
The cool things about Git don't all require its distributed aspect -- for instance, its branching is far superior to SVN's (if you could say SVN even has branching). I found myself three commits into some work last week that really should have been a separate branch. With Git, this was really easy to do -- I branched from the current state, then rewound the current branch to remove the commits I didn't want in it.
I was in a branch named indirection, and I decided it made sense to make a new branch named configurations.
Using the git reset man page, this is what I did:
$ git branch configurations
$ git reset --hard HEAD~3
$ git checkout configurations
This left me in the new branch I wanted and left the indirection branch in the state it was in before I made the big changes.
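If you want to convince yourself of what those three commands do before trying them on real work, here is a sketch that replays them in a throwaway repository. Everything apart from the three commands and the two branch names is made up for the demo: the temporary directory, the empty commits, the commit helper, and the demo identity are all assumptions.

```shell
set -e

# Scratch repository in a temporary directory (nothing here touches real work).
repo=$(mktemp -d)
cd "$repo"
git init -q
# Name the initial branch "indirection" regardless of git's default.
git symbolic-ref HEAD refs/heads/indirection

# Hypothetical helper: empty commits with a throwaway identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}
commit "indirection work"
commit "configurations 1"
commit "configurations 2"
commit "configurations 3"

# The three commands from the post: branch here, rewind, switch.
git branch configurations
git reset -q --hard HEAD~3
git checkout -q configurations

indirection_commits=$(git rev-list --count indirection)
configurations_commits=$(git rev-list --count configurations)
current_branch=$(git symbolic-ref --short HEAD)
echo "indirection: $indirection_commits, configurations: $configurations_commits, on: $current_branch"
```

The new branch keeps all four commits while indirection is rewound to the first one. Bear in mind that git reset --hard also throws away uncommitted changes in the working tree, so it is worth committing or stashing first.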
It's clearly not all peaches and cream, though. As I mentioned, there are definite usability issues. It's not so much that you can't figure it out but that it's just seldom what you expect. It doesn't help that the majority of the examples are from Linus's life, and his life is far more complicated than most, in terms of managing repositories.
The mechanism for pulling, fetching, and pushing branches is especially counterintuitive.
Overall, though, I'm very happy with it.
Sun, 23 Sep 2007 | Tags: tools, git, scm, dscm, svn, cvs
Linus on Git responding to KDE
Linus Torvalds posted a lengthy response to someone from the KDE community about using Git with KDE, and it's definitely worth a read:
Practically speaking, you'd generally have one or a few central repositories, yes. But no, it really doesn't have to be a single one. And I'm not just talking about mirroring (which is really easy with a distributed setup), I'm literally talking about things like some people wanting to use the "stable" tree, and not my tree at all, or the vendor trees.
And they are obviously connected, but it doesn't have to be a totally central notion at all.
Think of the git trees as people: some people are more "central" than others, but in the end, the kernel is actually fairly unusual (at least for a big project) in having just one person that is so much in the "center" that everybody knows about him.
Mon, 27 Aug 2007 | Tags: tools, git, kde, scm, dscm, svn, cvs
Giving Git a run-out
Something apparently snapped while I was at OSCON, and I collapsed my distributed source control management quandary down to Git. I think in the end it doesn't matter all that much, since the candidates are so similar in basic functionality, and I think I mostly got tired of sitting on the fence looking over but not being willing to commit to a specific dSCM.
Once I decided I'd go ahead with Git, my main priority was to get to the point where I could do my development on Puppet in it, which is especially important since it's the only real way for me to figure out if it will work for me, not that I really know what "work for me" means.
There are two crucial steps to testing an SCM for me: Getting Puppet's code into it, with as much history as possible, and making it available for others to have access to.
Getting the code was moderately easy, but made harder by the fact that I first made my Subversion repository when SVN was just starting to get popular, so I started without the typical branches/tags/trunk directory set. Here's the command I used in the end:
git svnimport -A ~/puppet-users -i -v http://reductivelabs.com/svn/puppet/ > /tmp/git.out
I tried git-svn, but it never got past revision 567 or so (which is when I switched to the popular directory structure). In addition, I was never able to actually get a working copy of the repository up to that point.
The puppet-users file contains a mapping from svn-style user names to email addresses:
luke = Luke Kanies <luke@domain.com>
lutter = David Lutterkort <dlutter@domain.com>
mpalmer = Matthew Palmer <mpalmer@domain.org>
I redirect output to a file, because it produces a bunch of output (I've got about 2800 revisions) and I don't actually care about any of it, and in addition, because I use iTerm, it takes a whole freaking CPU to scroll a terminal.
This basically worked, except that it started at revision 600 (arbitrarily close enough to the time when I changed the directory structure in the repository).
To make the repository shareable, I first just exported it via HTTP, which was pretty easy, but then I was told I need to use git-server for performance reasons. I built a Puppet module to set it all up, although the server doesn't work as well as I'd like (I really like SVN's auth file, which allows me to control who has access to the 32 repositories I maintain).
I'm getting some gritching from the Australians, and it's not like it's perfect, but at least I know I want something like that.
At the least, this has been a great experiment, and I figure we'll spend a week or so messing around with it. I'm not sure I can afford the time to experiment with all of the competitors; Matt's really pushing on darcs, but... I dunno, it seems niche, and at this point, I'm niche enough for all of us.
Tue, 07 Aug 2007 | Tags: tools, git, scm, svn, oscon, oscon07
git Display all 140 possibilities? (y or n)
I guess this is what people meant when they said git was "Unixy":
luke@phage(0) $ git
Display all 140 possibilities? (y or n)
git                     git-get-tar-commit-id   git-rebase
git-add                 git-grep                git-receive-pack
git-add--interactive    git-gui                 git-reflog
git-am                  git-hash-object         git-relink
git-annotate            git-http-fetch          git-remote
git-apply               git-http-push           git-repack
git-applymbox           git-imap-send           git-repo-config
git-applypatch          git-index-pack          git-request-pull
git-archimport          git-init                git-rerere
git-archive             git-init-db             git-reset
git-bisect              git-instaweb            git-rev-list
git-blame               git-local-fetch         git-rev-parse
git-branch              git-log                 git-revert
git-bundle              git-lost-found          git-rm
git-cat-file            git-ls-files            git-runstatus
git-check-ref-format    git-ls-remote           git-send-email
git-checkout            git-ls-tree             git-send-pack
git-checkout-index      git-mailinfo            git-sh-setup
git-cherry              git-mailsplit           git-shell
git-cherry-pick         git-merge               git-shortlog
git-citool              git-merge-base          git-show
git-clean               git-merge-file          git-show-branch
git-clone               git-merge-index         git-show-index
luke@phage(0) $ git
I think I'm going to be sick.
Tue, 17 Jul 2007 | Tags: tools, scm, git