Tag Archives: Git

A Better ansible_managed Tag

If you’re using Ansible to manage configuration files on your servers, you’re probably familiar with the {{ ansible_managed }} tag .

This tag is well behaved from an idempotentcy perspective, but in the nitty-gritty of managing servers where there’s a chance of interaction by humans, this built-in tag leaves a lot to be desired.

Enter Ansible filter plugins, and a little bit of magic in group_vars.

This filter plugin will obtain (according to git pretty-formats) the commit hash, author, and date of the last commit to the file provided by the filter plugin structure:

#!/usr/bin/python
# this should be saved to ./filter_plugins/get_last_commit_to_file.py

class FilterModule(object):
  def filters(self):
    return {
      'get_last_commit_to_file': self.get_last_commit_to_file
    }

  def get_last_commit_to_file(self, file_path):
    import os
    try:  # py3
      from shlex import quote
    except ImportError:  # py2
      from pipes import quote
    stream = os.popen('git log -n 1 --pretty=format:\'%h by %an at %aD\' -- ' + quote(file_path))
    output = stream.read()
    return output

So, now we can define two new variables in our group_vars/all.yml file: ansible_workspace_git_url and ansible_managed_custom.

Notice the use of our custom filter plugin – we can pipe the variable template_path variable (which Ansible gives to us for free) into our filter plugin to obtain a string about the git history of our file!

ansible_workspace_git_url: "https://github.com/myrepo"

ansible_managed_custom: "
Ansible Managed for {{ inventory_hostname }} ({{ansible_system_vendor}}::{{ansible_product_name}}::{{ansible_product_serial}}) \n
Ansible Git Repository URL: {{ ansible_workspace_git_url }}\n
Inventory File: {{ inventory_file|replace(inventory_dir,'') }}@{{ inventory_file | get_last_commit_to_file }}\n
Template file: {{ template_path|replace(inventory_dir,'') }}@{{ template_path | get_last_commit_to_file }}\n
Role Context: {{ role_name | default('None') }}\n
Rendered Template Target: {{ template_destpath }}"

We now have available for use inside any template a nicely formatted (but not necessarily idempotent) header which gives any readers DETAILED information including:

  • The hostname and BIOS serial number of the host to which this file was rendered
  • The git repository of the playbook responsible for this template
  • The filename of the template used to render this file
  • The hash of the last commit which affected the template used to render this file
  • The filename of the inventory supplying variables to render this template
  • The hash of the last commit which affected the inventory supplying variables to render this template
  • The target path on the target server to where this file was rendered

Now, anytime I’m templating out a file (well, at least any file that supports multiline comments), I can just include our new ansible_managed_custom variable:

<!--
  {{ ansible_managed_custom }}
--> 

By switching to this header format, I was able to better assist my teammates with identifying why a template was rendered in a particular fashion, as well as provide them details about how to fix any problems they may find. This template format also helps answer questions about whether the rendered file was _copied_ from one server to another (possibly in a migration) and whether the file is likely to still be under Ansible management.

Obtaining Git Repo URL from Jenkins Changeset: Unsolved

I’m attempting to obtain a Git Repo URL from a Jenkins Change set in a Groovy Scripted Pipeline, but I keep running into the same issue: the browser property (obtained via .getBrowser()) on my hudson.plugins.git.GitChangeSetList object is undefined.

I’m running the below code (with inline “status comments”) from the Jenkins groovy script console in an attempt to extract the RepoUrl from Chagesets in a Jenkins Multi branch groovy scripted pipeline:

def job = Jenkins.instance.getItem("MyJenkinsJob") 
def branch = job.getItems().findAll({ 
            item -> item.getDisplayName().contains("Project/CreateChangeLogs")
        })

printAllMethods(branch[0].getFirstBuild()) //this works, and is a org.jenkinsci.plugins.workflow.job.WorkflowRun

def builds = branch[0].getBuilds()
def currentBuild = builds[0]

currentBuild.changeSets.collect { 
  printAllMethods(it) // this works too, and is a hudson.plugins.git.GitChangeSetList.
  // enumerated methods are equals(); getClass(); hashCode(); notify(); notifyAll(); toString(); wait(); createEmpty(); getBrowser(); getItems(); getKind(); getRun(); isEmptySet(); getLogs(); iterator(); 

  it.getBrowser().repoUrl // this fails
  // the error is java.lang.NullPointerException: Cannot get property 'repoUrl' on null object
}

I found the utility class for PrintAllMethods here (https://bateru.com/news/2011/11/code-of-the-day-groovy-print-all-methods-of-an-object/):

  void printAllMethods( obj ){
    if( !obj ){
        println( "Object is null\r\n" );
        return;
    }
    if( !obj.metaClass && obj.getClass() ){
        printAllMethods( obj.getClass() );
        return;
    }
    def str = "class ${obj.getClass().name} functions:\r\n";
    obj.metaClass.methods.name.unique().each{ 
        str += it+"(); "; 
    }
    println "${str}\r\n";
}

The API spec for GitChangeSetList indicates that in extends hudson.scm.ChangeLogSet which implements getBrowser(), so the call should be valid.

Additionally, the source for GitChangeSetList invokes the super() constructor with the passed browser object.

At this point, I’m probably going to continue diving through the source code until I figure it out.

It looks like this is a documented issue with Jenkins: https://issues.jenkins-ci.org/browse/JENKINS-52747

And a (somewhat) related StackOverflow post: https://devops.stackexchange.com/questions/3798/determine-the-url-for-a-scm-trigger-from-inside-the-build

Git: Replace Root Commit with Second Commit

While migrating code between version control systems (in my case SourceGear Vault to Git using an open-source c# program called vault2git), it’s sometimes necessary to pre-populate the first commit in the target system.

This yields an empty commit (git commit -m "initial commit" --allow-empty) with a timestamp of today, which is chronologically out of order of the incoming change set migration.

After completing the migration, the second commit is actually the commit which I’d like to be the root.

It took me a while to figure this out, but thanks to
Greg Hewgill on Stack Overflow, I was able to replace the first commit of my branch with the second commit (and subsequently update the SHA1 hashes of all child commits) using this command:

git filter-branch --parent-filter "sed 's/-p <the__root_commit>//'" HEAD

Git Rebase

So, You’ve developed this great new feature and you’re ready to submit the code for inclusion into the project. You hit “pull request,” and patiently wait for feedback.

Then it happens.

Someone says “Can you merge this into [insert parent branch name here’]. You get a sinking feeling in your stomach, and say “oh no! now I have to make all of my changes over again from that branch.

Never fear, this is what rebasing is for!

In this case, you need to tell git to take the commits you’ve added, and play them back against a different branch.  The process goes something like this:


 git rebase --onto master myNewFeatureBranch~1 myNewFeatureBranch

If all goes well, you’ll end up in the same branch, all of your changes will be intact, and the result of (diff to master) will be only your changes!!!

Also Collapse commits: http://stackoverflow.com/questions/6884022/collapsing-a-group-of-commits-into-one-on-git