Running a documentation build separate from a software build

HIMEM.SYS

2018-09-11 07:10:25 +0000

We have a build that just generates the API documentation for a piece of software. It does so using docfx.exe, which requires only the source code, not the binaries.

We didn’t want to include documentation generation in the software build, because it takes extra time. On the other hand, we wanted to make sure that the documentation always reflects a successful build of the software and not some intermediate state.

The idea is as follows:

Let the software be built using whatever form of TFS build is set up.
At a fixed time, normally at night, run a separate scheduled build that is only responsible for building the documentation. To fulfill the requirement of working with a “stable set” of sources, this build needs to determine the branch and change set that were used for the software build. It then needs to get (checkout) this version of the sources.

The TFS build system does not support such a feature, so we’ll have to roll our own process. We can, however, make use of some features and facilities that TFS does provide to not only make our process integrate neatly into the TFS build process, but also to make it more stable.

In this post we will use a PowerShell script, but really you could use any other language/technology (or even a custom build task) that is capable of performing the required actions.

Authentication

In the following, our build script will perform certain actions against the TFS server. To do so, it must authenticate itself. There are several ways to do this. For example, you could create a PAT (personal access token) and use that, or you could create a dedicated user and use its credentials from the script. Both variants are cumbersome and require maintenance (e.g. updating the token when it expires or changing the password when necessary), or are downright insecure (storing the user’s password in the script; at least you could keep it as an encrypted variable of the build definition).

Luckily, there is a better way to handle this: you use the same credentials that the build definition/process itself uses. For example, when it runs the “Get Sources” task, the agent processing the build also has to authenticate against the server. It does so using an access token. This token can be accessed from build scripts/tasks when the build definition allows it. To enable this, you need to set the “Allow scripts to access OAuth token” setting in the “Options” of your build definition.

Once this is done, the access token is made available to scripts through the environment variable SYSTEM_ACCESSTOKEN. It can be used in two ways:

You can pass it to tf.exe using the /login and /loginType command line options.
You can pass it to the TFS REST API using the Authorization HTTP header.
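
For the REST case, the header can be built in one small place so that a missing token is noticed early. The following is only a minimal sketch: the function name Get-AuthorizationHeader is made up for this post, and the error message assumes that an empty SYSTEM_ACCESSTOKEN means the “Allow scripts to access OAuth token” option has not been enabled.

```powershell
# Hypothetical helper (not part of TFS): builds the HTTP headers for REST calls.
function Get-AuthorizationHeader {
    $token = $env:SYSTEM_ACCESSTOKEN
    if ([string]::IsNullOrEmpty($token)) {
        # Assumption: an empty variable means the build option is not enabled.
        throw "SYSTEM_ACCESSTOKEN is not set. Is 'Allow scripts to access OAuth token' enabled?"
    }
    return @{ Authorization = "Bearer $token" }
}
```

The result can then be passed directly to the -Headers parameter of Invoke-RestMethod.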

Finding the last successful build

First we need to know the build definition that produces the builds we’re interested in. For this post we just assume it is “PROD-B-1.0”.

Having this, we need to query TFS to find the last successful build and extract the required branch and change set information from that.

For that purpose we can use the TFS REST API: we send an HTTP GET request to a URL like http://<servername>/<collection>/<project>/_apis/build/builds?<query> using the Invoke-RestMethod PowerShell command. As with most other system- and build-specific information, many of the parts that make up the URL are available as variables, that is, they can be accessed as environment variables.

$definition = 'PROD-B-1.0'
# EscapeDataString also escapes characters such as '&' and '=' inside the definition name
$uri = $env:SYSTEM_TEAMFOUNDATIONCOLLECTIONURI `
        + $env:SYSTEM_TEAMPROJECT `
        + "/_apis/build/builds?definition=" + [Uri]::EscapeDataString($definition) `
        + "&`$top=1" `
        + "&status=Completed"

$info = Invoke-RestMethod -Uri $uri -Headers @{ Authorization = "Bearer $env:SYSTEM_ACCESSTOKEN" }
if (!$info -Or !$info.value -Or ($info.value.Length -ne 1)) {
    throw "No unique build found"
}

For more information on the URI and parameters see the documentation. In short we’re asking TFS to return the newest ($top=1) build of our build definition that has succeeded (status = Completed).

Thanks to Invoke-RestMethod, the result is already a PSObject and we can pick individual members:

$branch = $info.value[0].sourceBranch
$changeSet = $info.value[0].sourceVersion

Note: Had we used Invoke-WebRequest instead, we could have used ConvertFrom-Json to manually create a PSObject from the resulting JSON-formatted string. Note also that ConvertFrom-Json has issues with JSON strings that are not on a single line (e.g. formatted/indented). So it is probably best to just use the newer Invoke-RestMethod instead of Invoke-WebRequest | ConvertFrom-Json.
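
If you do end up with JSON text spread over several lines, joining the lines into one string before parsing sidesteps the issue. The sketch below uses a made-up sample response that contains only the two fields used above; the branch path and change set number are invented for illustration.

```powershell
# Sample response, made up for illustration; only the fields used above are shown.
$lines = @(
    '{',
    '  "count": 1,',
    '  "value": [',
    '    {',
    '      "sourceBranch": "$/MyProject/Main",',
    '      "sourceVersion": "4711"',
    '    }',
    '  ]',
    '}'
)

# Join the lines into a single string before handing them to ConvertFrom-Json.
$parsed = ($lines -join "`n") | ConvertFrom-Json

$sampleBranch = $parsed.value[0].sourceBranch
$sampleChangeSet = $parsed.value[0].sourceVersion
Write-Host "Branch: $sampleBranch, change set: $sampleChangeSet"
```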

Getting the sources

Now we need to get (“checkout”) the sources for the build we found. With TFVC this is not as easy as it could be, because we first need to create a workspace to get them into. We cannot use the workspace that the agent has created for the currently running build, because that is already in use.

A typical agent source directory would be C:\agent\<name>\_work\1\s which the agent has mapped to a workspace named, say, ws_1_24.

For the sources we want to get, we’ll need to create a directory that is outside of the above (because you cannot have nested workspaces) and choose a workspace name that is still unique.

We choose the directory C:\agent\<name>\_work\1\_src. This is still within the agent’s working directory (C:\agent\<name>\_work\1) and thus cannot interfere with others, yet it is outside of the agent’s workspace directory. As the workspace name we simply use the name that the agent uses, plus a suffix: ws_1_24_src. This name will therefore be unique as well.

Finally, we do much the same as the “Get Sources” task does (you can check what it does by setting the system.debug variable to true and checking your build log):

Create the working directory (here C:\agent\<name>\_work\1\_src) if it does not yet exist.
Check if the workspace (here ws_1_24_src) already exists.
- If it exists, delete it.
Create the workspace.
Unmap the $/ from the workspace, just in case. We don’t want to accidentally get the complete repository if something is messed up.
Map the “branch” repository path from the previous step to the working directory.
Get the sources into the working directory.

Before we show the actual steps, let’s first talk about properly invoking tf.exe, which will perform all of the above tasks for us.

The first thing to know would be where tf.exe is actually located, that is, what the full path to the executable is. Luckily, we don’t need to know! TFS already took care of that during the “Get Sources” step. Look at the log output carefully:

2018-09-11T13:49:30.9421255Z ##[section]Starting: Get Sources
2018-09-11T13:49:31.0827667Z Prepending Path environment variable with directory containing 'tf.exe'.
...

So basically, for the rest of our build we can simply invoke tf.exe without any path information. (In case you’re curious, it is actually located in F:\agent\<name>\externals\vstsom, but I would consider this an implementation detail).
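
If you would rather fail early with a clear message than run into a “command not found” error later, you can check the PATH yourself. This is a small sketch using the standard Get-Command cmdlet; the function name Test-ToolOnPath is invented for this example.

```powershell
# Hypothetical helper: returns $true if the given executable can be resolved via PATH.
function Test-ToolOnPath([string]$name) {
    $command = Get-Command -Name $name -CommandType Application -ErrorAction SilentlyContinue
    return [bool]$command
}

if (-not (Test-ToolOnPath 'tf.exe')) {
    Write-Warning "tf.exe was not found on the PATH. Did the 'Get Sources' step run?"
}
```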

Now that we know how to invoke it in principle, there are some things to take care of when doing so. Above we have already seen that /loginType:OAuth is required to use the agent’s access token to authenticate. Also, you should pass the /noprompt option to make sure the tool asks no interactive questions, which would make little sense in an automated build.

For the following steps, we have built these helper functions:

function WriteCommand($commandLine) {
    Write-Host "##[command]$commandLine"
}

function GetTfCommandLine($arguments, $needCollection = $true, $needLogin = $true) {
    $tfc = "tf.exe vc $arguments"
    if ($needCollection) {
        $tfc += " /collection:$env:SYSTEM_TEAMFOUNDATIONCOLLECTIONURI"
    }
    if ($needLogin) {
        $tfc += " /loginType:OAuth `"/login:.,$env:SYSTEM_ACCESSTOKEN`""
    }
    $tfc += " /noprompt"
    return $tfc
}

function RunTfCommand($message, $arguments, $needCollection = $true, $needLogin = $true) {
    Write-Host $message
    $tfc = GetTfCommandLine $arguments $needCollection $needLogin
    WriteCommand $tfc
    Invoke-Expression $tfc
    if ($LASTEXITCODE -ne 0) {
        throw "$message. Failed with exit code $LASTEXITCODE."
    }
}

Note the use of the Write-Host "##[command]..." call to log the command for diagnostics. It is one of a couple of logging commands that TFS supports. This one has the benefit of automatically hiding sensitive information from the output (here the system access token). For example, instead of logging

tf.exe vc get ... /login:.,<tokenstring>

it will automatically log only

tf.exe vc get ... /login:.,*******

to the build console and log.

OK, let’s look at the steps laid out above in code:
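
The helpers above assemble one command string and run it with Invoke-Expression, which means quoting has to be handled by hand. A possible alternative, shown here only as a sketch, is to collect the arguments in an array and invoke tf.exe with the call operator (&). The function names Get-TfArguments and Invoke-Tf are hypothetical; the tf.exe options are the same ones used above.

```powershell
# Hypothetical variant of GetTfCommandLine that returns an argument array.
function Get-TfArguments([string[]]$arguments, $needCollection = $true, $needLogin = $true) {
    $result = @('vc') + $arguments
    if ($needCollection) {
        $result += "/collection:$env:SYSTEM_TEAMFOUNDATIONCOLLECTIONURI"
    }
    if ($needLogin) {
        $result += '/loginType:OAuth'
        $result += "/login:.,$env:SYSTEM_ACCESSTOKEN"
    }
    $result += '/noprompt'
    return $result
}

# Hypothetical counterpart to RunTfCommand. Each array element is passed
# to tf.exe as one argument, so embedded spaces need no manual quoting.
function Invoke-Tf($message, [string[]]$arguments, $needCollection = $true, $needLogin = $true) {
    Write-Host $message
    $tfArgs = Get-TfArguments $arguments $needCollection $needLogin
    Write-Host "##[command]tf.exe $($tfArgs -join ' ')"
    & tf.exe @tfArgs
    if ($LASTEXITCODE -ne 0) {
        throw "$message. Failed with exit code $LASTEXITCODE."
    }
}
```

A call would then pass the arguments as separate elements, for example Invoke-Tf "Unmapping workfolder" @('workfold', '/unmap', '/workspace:ws_1_24_src', '$/').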

Create the working directory (here C:\agent\<name>\_work\1\_src) if it does not yet exist.

$WorkingDir = [System.IO.Path]::GetFullPath($env:BUILD_SOURCESDIRECTORY + '\..\_src')

if (!(Test-Path -Path $WorkingDir -PathType Container)) {
    Write-Host "Creating working directory '$WorkingDir'"
    mkdir $WorkingDir
} else {
    Write-Host "Cleaning working directory '$WorkingDir'"
    # '*' (not '*.*') so that items without a dot in their name are removed as well
    Remove-Item -Force -Recurse "$WorkingDir\*"
}

Check if the workspace (here ws_1_24_src) already exists. If it exists, delete it.

$WorkspaceName = $env:BUILD_REPOSITORY_TFVC_WORKSPACE + '_src'

Push-Location $WorkingDir

Write-Host "Looking for existing workspace '$WorkspaceName'"
$getWorkspaces = GetTfCommandLine "workspaces /format:XML"
WriteCommand $getWorkspaces
$workspaces = [xml](Invoke-Expression $getWorkspaces)
$existingWorkspace = $workspaces.workspaces.workspace | Where-Object { $_.name -eq $WorkspaceName }
if ($existingWorkspace) {
    RunTfCommand "Removing existing workspace '$WorkspaceName'" "workspace /delete `"$($WorkspaceName);$($existingWorkspace.owner)`""
}
RunTfCommand "Creating workspace '$WorkspaceName'" "workspace /new /location:local /permission:Public $WorkspaceName"

A couple of things are going on here.

We create the name of our workspace using the agent’s workspace name, plus our suffix _src.
We need to change to the working directory, as most of the following commands operate on it implicitly.
We execute the tf vc workspaces /format:XML command. In PowerShell we can cast its output to XML and then operate on the result as an object.
If our workspace already exists, we delete it using the tf vc workspace /delete command.
Finally, we (re)create the workspace using the tf vc workspace /new command.
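
To see the XML cast and the filter in isolation, here is a small sketch that runs without a TFS server. The sample XML is made up; its shape is an assumption derived only from the property names the script above uses (workspaces, workspace, name, owner), and the owner value is invented.

```powershell
# Sample XML, made up for illustration. Its shape is an assumption derived from
# the property names used in the script above; the real output may contain more.
$sampleXml = [xml]@'
<workspaces>
  <workspace name="ws_1_24" owner="Build Agent" />
  <workspace name="ws_1_24_src" owner="Build Agent" />
</workspaces>
'@

$sampleName = 'ws_1_24_src'

# Attributes of each <workspace> element are available as properties.
$found = $sampleXml.workspaces.workspace | Where-Object { $_.name -eq $sampleName }
if ($found) {
    Write-Host "Found workspace '$($found.name)' owned by '$($found.owner)'"
}
```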

Unmap the $/ from the workspace, just in case. We don’t want to accidentally get the complete repository if something is messed up. Map the “branch” repository path from the previous step to the working directory.

RunTfCommand "Unmapping workfolder" "workfold /unmap /workspace:$WorkspaceName $/"
RunTfCommand "Mapping workfolder" "workfold /map `"$branch`" `"$WorkingDir`" /workspace:$WorkspaceName"

Get the sources into the working directory.

RunTfCommand "Getting sources" "get /version:C$changeSet /recursive `"$WorkingDir`" /overwrite" $false
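
The command above assumes that sourceVersion holds the bare change set number. Since the value ends up in a string that is passed to Invoke-Expression, checking it first is cheap insurance. The sketch below is a defensive assumption, not something TFS requires: the hypothetical function ConvertTo-ChangesetSpec accepts the number with or without a leading C and rejects anything else.

```powershell
# Hypothetical helper: turns '4711' or 'C4711' into the version spec 'C4711'.
function ConvertTo-ChangesetSpec([string]$sourceVersion) {
    $trimmed = $sourceVersion.Trim()
    if ($trimmed -match '^[Cc]?(\d+)$') {
        # $Matches[1] holds the digits captured by the regular expression.
        return "C$($Matches[1])"
    }
    throw "Unexpected change set value '$sourceVersion'"
}
```

The result would replace the C$changeSet part, i.e. the option becomes /version: followed by the returned value.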

Where are we now

At this point the build process has fetched the sources at the version (i.e. change set) that was used for the latest successful build.

Now we can generate the actual documentation from these sources. What exactly happens here depends on the build scripts you already have for generating your documentation.

You should put the above PowerShell code into a script, say Prepare.ps1, put it into your repository (in a location different from the sources you want to get later on) and invoke it as a step in the build pipeline.

For example, your build pipeline could look like this:

“Get Sources” - this will fetch the $/MyProject/Tools/ folder, which contains the Prepare.ps1 script with the code from above.
“Run PowerShell” - Create a task that will run the $(build.sourcesDirectory)\Prepare.ps1 script; in real life you will probably want to pass some of the things we “hardcoded” above (like the name of the build definition to find the newest build of) as build process variables.
Run whatever task/step you want based on the sources that are now in $(build.sourcesDirectory)\..\_src.
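
To avoid the hardcoded definition name, the script could read it from a build variable. The sketch below assumes a variable named DocSourceDefinition has been defined on the build definition (the name is invented for this example); build variables reach scripts as upper-cased environment variables, as with SYSTEM_TEAMPROJECT above. The function name Get-SourceDefinitionName is hypothetical as well.

```powershell
# Hypothetical helper: reads the definition name from the (assumed) build
# variable DocSourceDefinition and falls back to the value used in this post.
function Get-SourceDefinitionName([string]$fallback = 'PROD-B-1.0') {
    $fromVariable = $env:DOCSOURCEDEFINITION
    if ([string]::IsNullOrWhiteSpace($fromVariable)) {
        return $fallback
    }
    return $fromVariable.Trim()
}

$definition = Get-SourceDefinitionName
Write-Host "Looking for builds of definition '$definition'"
```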