All posts by crossan007

Display HTTPS X509 Cert from Linux CLI

Recently, while attempting a git pull, I was confronted with the following error:

Peer's certificate issuer has been marked as not trusted by the user.

The operation worked in a browser on my dev machine, and closer inspection revealed that the cert used to serve the GitLab service was valid, but for some reason the CentOS Linux server couldn’t pull from the remote.

I found this post on StackOverflow detailing how to retrieve the X509 cert used to secure an HTTPS connection:

echo | openssl s_client -showcerts -servername MyGitServer.org -connect MyGitServer.org:443 2>/dev/null | openssl x509 -inform pem -noout -text

This was my ticket to discover why Git on my CentOS server didn’t like the certificate: the CentOS host was resolving the wrong DNS host name, and therefore using an invalid cert for the service.

And now a Haiku:

http://i.imgur.com/eAwdKEC.png

Git: Replace Root Commit with Second Commit

While migrating code between version control systems (in my case SourceGear Vault to Git, using an open-source C# program called vault2git), it’s sometimes necessary to pre-populate the first commit in the target system.

This yields an empty commit (git commit -m "initial commit" --allow-empty) with a timestamp of today, which is chronologically out of order with the incoming change sets from the migration.

After completing the migration, the second commit is actually the commit which I’d like to be the root.

It took me a while to figure this out, but thanks to Greg Hewgill on Stack Overflow, I was able to replace the first commit of my branch with the second commit (and subsequently update the SHA1 hashes of all child commits) using this command:

git filter-branch --parent-filter "sed 's/-p <the__root_commit>//'" HEAD

Intermittently Slow IIS web site

TL;DR:

  • An issue in the Windows Management Instrumentation (WMI) performance counter collection process caused periodic system-wide performance degradation.
  • This issue became visible when our infrastructure monitoring software invoked specific WMI queries.
  • We disabled the specific WMI query set which was causing the performance issues, and the problem went away.

A few days ago one of our clients began reporting performance issues on one of their web sites. This site is an IIS web application responsible for rendering visualizations of very large data sets (hundreds of gigabytes). As such, the application pool consumes a corresponding amount of RAM (which is physically available on the server).

Normally, these sites (I manage a few hundred instances) are fast, with most queries returning in under 300ms; however, this one instance proved difficult. To make matters worse, the performance issues were intermittent: most of the time, the site was blazing fast, but sometimes the site would hang for minutes.

Given a few hours of observation, one of my team members noticed a correlation between the performance issues of the site and a seemingly unrelated process on the host: WmiPrvSe.exe

I began digging in and was able to corroborate this by looking at the process’s CPU usage over time (using ELK / MetricBeat to watch Windows processes). Sure enough, there’s a direct correlation between WmiPrvSe.exe using ~3-4% CPU and IIS logs indicating a timeTaken of greater than 90 seconds. This correlation also established an interval between instances of the issue: 20 minutes.

I fired up Sysinternals’ ProcMon.exe to get a better handle on what exactly WmiPrvSe.exe was doing during these so-called “spikes”. I observed an obscene count of Registry queries to things looking like Performance counters (RegQueryValue, RegCloseKey, RegEnumKey, RegOpenKey). Note that there are multiple instances of WmiPrvSe.exe running on the system, but only one instance was “misbehaving”: the one running as NT AUTHORITY\SYSTEM (which also happens to have the lowest PID). The instances running as NT AUTHORITY\NETWORK SERVICE and as NT AUTHORITY\LOCAL SERVICE did not seem to be misbehaving.

Almost all of the registry keys in question contained the string Performance or PERFLIB; many (but not all) queries were against keys within HKLM\System\CurrentControlSet\Services.

I know that I have Elastic Co.’s “Beats” agents installed on this host; could Metricbeat or one of my other monitoring tools be the culprit? So, I tried disabling all of the “Beats” agents (filebeat, metricbeat, winlogbeat, etc.), but was still seeing these intermittent spikes in WmiPrvSe.exe CPU usage correlating with slow page loads from IIS.

Stumped, I searched for how to capture WMI application logs, and found this article: https://docs.microsoft.com/en-us/windows/desktop/wmisdk/tracing-wmi-activity.

I ran the suggested command (Wevtutil.exe sl Microsoft-Windows-WMI-Activity/Trace /e:true) and fired up Event Viewer (as admin) to browse to the above path. Bingo.
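If you’d rather skim the trace from PowerShell instead of Event Viewer, something like this should work once the log is enabled (if I recall correctly, analytic/trace logs have to be read with the -Oldest switch):

# Read the first 100 events from the (now enabled) WMI activity trace log
Get-WinEvent -LogName 'Microsoft-Windows-WMI-Activity/Trace' -Oldest -MaxEvents 100 |
    Select-Object TimeCreated, Id, Message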

Log hits in Microsoft-Windows-WMI-Activity/Trace included mostly checks against the networking devices: select __RELPATH, Name, BytesReceivedPersec, BytesSentPersec, BytesTotalPersec from Win32_PerfRawData_Tcpip_NetworkInterface

These WMI queries were executed by the ClientProcessId owned by nscp.exe.
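As a quick sanity check (not NSCP’s own code, just replaying roughly the same query by hand), you can run the query from PowerShell and watch WmiPrvSe.exe while it executes:

# Replay (approximately) the same WQL query NSCP issues; __RELPATH is omitted here
# since it is a system property and doesn't change the work the provider has to do.
Measure-Command {
    Get-WmiObject -Query "select Name, BytesReceivedPersec, BytesSentPersec, BytesTotalPersec from Win32_PerfRawData_Tcpip_NetworkInterface"
}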

I perused the source code for NSCP a bit, and discovered that NSCP’s network queries are executed through WMI (https://github.com/mickem/nscp/blob/master/modules/CheckSystem/check_network.cpp#L105), while the standard performance counter queries are executed through PDH (https://github.com/mickem/nscp/blob/master/modules/CheckSystem/pdh_thread.cpp#L132).

Something else I noticed was that the Microsoft-Windows-WMI-Activity/Operational log contained events directly corresponding to the issue at hand: WMIProv provider started with result code 0x0. HostProcess = wmiprvse.exe; ProcessID = 3296; ProviderPath = %systemroot%\system32\wbem\wmiprov.dll

Some more creative Google searches yielded an interesting issue in the GitHub repo for a different project: “CPU collector blocks every ~17 minutes on call to wmi.Query” (#89).

Sounds about right.

Skimming through the issue, I see this, which sets off the “ah-ha” moment:

Perfmon uses the PDH library, not WMI. I did not test with Perfmon, but PDH is not affected.


leoluk commented on Feb 16, 2018 (https://github.com/martinlindhe/wmi_exporter/issues/89#issuecomment-366195581)

Now knowing that only NSCP’s check_network uses WMI, I found the documentation to disable the network routine in nscp’s CheckSystem module: https://docs.nsclient.org/reference/windows/CheckSystem/#disable-automatic-checks

I added the bits to my nsclient.ini config to disable automatic network checks, restarted NSCP, and confirmed the performance issue was gone:

[/settings/system/windows]
# Disable automatic checks
disable=network
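To confirm the fix, a simple spot check (assuming the default English counter names) is to sample the WMI provider host’s CPU across a few of the 20-minute windows:

# Sample WmiPrvSE CPU every 30 seconds for an hour; the periodic ~3-4% spikes should no longer appear
Get-Counter '\Process(WmiPrvSE*)\% Processor Time' -SampleInterval 30 -MaxSamples 120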

I’ve opened an issue on NSCP’s GitHub page for this problem: https://github.com/mickem/nscp/issues/619


Tail a file on Windows

On almost every Unix system, we have tail -f to watch the end of *really really big* files.

When faced with a 36GB log file on Windows, the tooling is often lacking.

I borrowed / adapted a little PowerShell function to extract the last n log lines from a file and write them to a new file:
https://gist.github.com/crossan007/b5e8ac4579ba61eb1967315657406751
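The gist has the full function; the core of the idea (a minimal sketch with a hypothetical function name, not the gist itself) is just Get-Content -Tail, which reads from the end of the file instead of loading all 36GB:

function Get-LogTail {
    param(
        [string]$Path,      # the huge log file
        [string]$OutFile,   # where to write the extracted lines
        [int]$Lines = 1000  # how many lines from the end of the file
    )
    # -Tail (PowerShell 3+) seeks to the end of the file rather than reading it from the start
    Get-Content -LiteralPath $Path -Tail $Lines | Set-Content -LiteralPath $OutFile
}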

Partially borrowed from: https://stackoverflow.com/questions/36507343/get-last-n-lines-or-bytes-of-a-huge-file-in-windows-like-unixs-tail-avoid-ti

TIL: Java code in Jenkins pipelines runs on the Master

I was trying to read a file with java.io.File in a Jenkins Groovy Scripted Pipeline on a non-master node. I kept getting an exception that the file was not found (java.io.FileNotFoundException).

Turns out that Java code written in scripted pipelines (Groovy) runs on the master node: https://issues.jenkins-ci.org/browse/JENKINS-37577. This is as-designed behavior; code that needs to access files in the workspace of a non-master node should use the readFile function from the Pipeline Basic Steps DSL: https://jenkins.io/doc/pipeline/steps/workflow-basic-steps/#pwd-determine-current-directory

I’m thoroughly embarrassed at how many failed Jenkins jobs and alerts I’ve triggered while discovering this.

Windows 10 Password Recovery

DISCLAIMER: DO NOT EXECUTE THIS PROCESS WITHOUT EXPLICIT APPROVAL FROM THE SYSTEM OWNERS. I AM NOT ENDORSING OR APPROVING ANY ILLEGAL ACTIVITY WHICH COULD BE ACCOMPLISHED BY FOLLOWING THESE STEPS.

An older friend forgot his computer password and asked me for help.

I booted the machine and saw an email address where the Windows 10 username normally would be; my first thought was “oh, great; this is a Microsoft Online-joined computer, password recovery probably won’t happen.”

I did a little research and found some evidence suggesting that my seemingly outdated knowledge about passwords being stored in the SAM still stands. However, the Windows 10 Anniversary Update changed the encryption algorithm used on the SAM: https://twitter.com/gentilkiwi/status/762465220132384770

This algorithm change broke my normal tool (OPHCRACK), since it was unable to read the NTLM hashes from the SAM. The new SAM encryption caused OPHCRACK to incorrectly read every account hash as 31d6cfe0d16ae931b73c59d7e0c089c0 (the hash of an empty password). So, I copied the SAM and SYSTEM files (from C:\Windows\System32\config) from the target machine to my desktop for additional processing.

Mimikatz has a module `lsadump::sam` which accepts parameters for offline SYSTEM and SAM decryption.  Easy command line:

lsadump::sam /system:c:\users\charles\documents\system /sam:c:\users\charles\documents\sam

This returned decrypted NTLM hashes for easy cracking.

I decided to try a new tool to crack the plain-text passwords from the NTLM hashes: Hashcat. There’s a Windows 64-bit compiled version (I know, I know, don’t run random binaries…) which made it easy to get cracking quickly.

I copied the hash from the output of Mimikatz into a text file and ran the command below (-m 1000 selects NTLM, -a 3 is a brute-force/mask attack, -O enables the optimized kernels, and -o writes cracked passwords to an output file):

.\hashcat64.exe -m 1000 -a 3 -O -o pass1.txt .\hashes.hash

My 10-year-old computer cracked the Microsoft Online account’s NTLM Windows 10 password hash in ~8 minutes. The password was two dictionary words and a two-digit number, for a total of 8 characters. I was using brute force in this scenario, so the fact that dictionary words were used is of no consequence; had I been using a dictionary attack, it would likely have concluded sooner.

Just for fun, I generated a new NTLM hash with the vowels replaced by numbers (i with 1, e with 3, and so forth); the attack took the same amount of time.


For reference, an NTLM hash is just the MD4 digest of the UTF-16LE encoded password, which a couple of lines of Python can reproduce:

import hashlib

# NTLM hash = MD4 over the UTF-16LE bytes of the password
print(hashlib.new('md4', 'password'.encode('utf-16le')).hexdigest())

Moral of the story:  USE STRONG PASSWORDS AND A PASSWORD MANAGER

Un-Approve SharePoint List Item Previous Versions

I recently had a change request against a SharePoint Forms Library I had created a few years ago – the request was to adjust the permissions so that form submitters could see only the forms that they’ve submitted (and not others).

This is a generally straightforward action on new libraries: enable “Require content approval for submitted items?”, and change “Who should see draft items in this document library?” to “Only users who can approve items (and the author of the item)”.

However, enabling these settings seems to have caused the items that already existed to receive an Approval Status of “Approved,” despite a pending Approval workflow. This had the undesired effect of allowing users who do not hold the “Approve” permission level to access previous versions of items still in the approval workflow.

I needed to reject previous versions of forms where the current version had not yet been approved.  On lots of items.

I found numerous examples via Google of how to use PowerShell to set the Approval Status of list items; however, nearly every example dealt with only the current version of a list item, making no mention of altering the approval status of previous versions.

Additionally, I found a few posts attempting to manipulate attributes of previous versions; the responses to each of these inquiries varied:

  • “you can’t – history is read-only,”
  • “you can migrate the documents to a new list, and re-build the history”
  • “you can delete the old versions”

I even found a mega-thread on TechNet how to “List and Delete List Item Versions using PowerShell,” and a “Complete Guide to Getting and Setting Fields Using PowerShell”

None of these options accomplished what I was seeking:  to simply remove the approval on previous versions.

Finally, I resorted to simply poking at the objects from PowerShell (never underestimate the power of Get-Member to explore objects). Attempting to modify the properties on a previous version would yield the error message “Unable to index into an object of type Microsoft.SharePoint.SPListItemVersion” (Link).

Ok – different approach: I knew my desired action was feasible via the UI for single list items.

So, I opened Chrome developer tools and captured the command sent when clicking “Reject this version”: a POST to “/_layouts/versions.aspx” with the ItemID and an “op” value of “TakeOffline”. A quick Google search revealed a server-side object model equivalent: Microsoft.SharePoint.SPFile.TakeOffline.

My solution: invoke SPListItem.File.TakeOffline() for every file which is currently pending and has a previously approved version:
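The original script isn’t reproduced here, but a rough sketch of the approach (server-side object model, hypothetical site and library names, and a simplified “has earlier versions” check in place of my exact filter) looks something like this:

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$web  = Get-SPWeb "https://sharepoint.contoso.com/sites/Forms"   # hypothetical site URL
$list = $web.Lists["My Forms Library"]                            # hypothetical library name

foreach ($item in @($list.Items)) {
    # Only touch items whose current version is still awaiting approval
    if ($item.ModerationInformation.Status -eq [Microsoft.SharePoint.SPModerationStatusType]::Pending -and
        $item.Versions.Count -gt 1) {
        # Same server-side operation the "Reject this version" UI button performs
        $item.File.TakeOffline()
    }
}

$web.Dispose()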

 

SharePoint 2013 List Workflows Failing

Quick Post.

Today I had an issue with a SharePoint 2013 List Workflow not running on a SharePoint Online Team Site. The workflow’s status details showed the following:

 

Retrying last request. Next attempt scheduled in less than one minute. Details of last request: HTTP  to https://<SomeCoolTenant>.sharepoint.com/sites/<SomeCoolSite>/_api/web/lists(guid'**********************************') Correlation Id:  Instance Id: *************************************

System.Net.WebException: The request was aborted: The request was canceled. ---> System.InvalidOperationException: Failed to fetch an access token from the token service. The token service returned an error type of 'unauthorized_client' with the following description: AADSTS70001: Application with identifier '**************************' was not found in the directory **************************************
Trace ID: **********************************
Correlation ID: *****************************************
Timestamp: 2017-10-30 14:07:03Z ---> System.Net.WebException: The remote server returned an error: (400) Bad Request.
at System.Net.HttpWebRequest.GetResponse()
at Microsoft.Activities.Hosting.Security.OAuthS2SSecurityTokenServiceCredential.FetchAccessToken(Uri stsUri, String targetServiceAudience, String authenticatorToken, HttpWebRequest request, TimeSpan timeout, EventTraceActivity eventTraceActivity, TimeSpan& expirationDuration)
--- End of inner exception stack trace ---
at Microsoft.Activities.Hosting.Security.OAuthS2SSecurityTokenServiceCredential.FetchAccessToken(Uri stsUri, String targetServiceAudience, String authenticatorToken, HttpWebRequest request, TimeSpan timeout, EventTraceActivity eventTraceActivity, TimeSpan& expirationDuration)
at Microsoft.Activities.Hosting.Security.OAuthS2SSecurityTokenServiceCredential.GetAccessTokenFromTokenService(OAuthS2SPrincipal client, OAuthS2SPrincipal targetServiceAudience, HttpWebRequest originalRequest, EventTraceActivity eventTraceActivity, TimeSpan& expirationDuration)
at Microsoft.Activities.Hosting.Security.OAuthS2SSecurityTokenServiceCredential.GetAuthorization(OAuthS2SAuthenticationChallenge[] bearerChallenges, HttpWebRequest request, EventTraceActivity eventTraceActivity)
at Microsoft.Activities.Hosting.Security.OAuthS2SAuthenticationModule.AuthenticateInternal(String challenge, WebRequest request, OAuthS2SCredential credential, EventTraceActivity eventTraceActivity)
at Microsoft.Activities.Hosting.Security.OAuthS2SAuthenticationModule.Authenticate(String challenge, WebRequest request, ICredentials credentials)
at System.Net.AuthenticationManagerDefault.Authenticate(String challenge, WebRequest request, ICredentials credentials)
at System.Net.AuthenticationState.AttemptAuthenticate(HttpWebRequest httpWebRequest, ICredentials authInfo)
at System.Net.HttpWebRequest.CheckResubmitForAuth()
at System.Net.HttpWebRequest.CheckResubmit(Exception& e, Boolean& disableUpload)
at System.Net.HttpWebRequest.DoSubmitRequestProcessing(Exception& exception)
at System.Net.HttpWebRequest.ProcessResponse()
at System.Net.HttpWebRequest.SetResponse(CoreResponseData coreResponseData)
--- End of inner exception stack trace ---
at Microsoft.Workflow.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)
at Microsoft.Activities.Hosting.HostedHttpExtension.HttpRequestWorkItem.HttpRequestWorkItemAsyncResult.End(IAsyncResult result, Int32& responseCode)
at Microsoft.Activities.Hosting.HostedHttpExtension.HttpRequestWorkItem.OnEndComplete(ScheduledWorkItemContext context, IAsyncResult result)

 

Turns out that I had forgotten to enable the “Workflows can use app permissions” site feature.

So – if you’re not yet using Microsoft Flow and still need those SharePoint 2013 Workflows, remember to enable this site feature.

The Un-Deletable File

I’m not sure how I created it, but somehow I managed to create a folder called

'.

I can’t delete it from Windows Explorer, PowerShell, CMD, [System.IO.Directory]::Delete(), or any other method I’ve attempted yet. I also can’t delete the parent folder.

SDelete won’t work: https://technet.microsoft.com/en-us/sysinternals/sdelete.aspx

 

I’ve checked for open handles on the file and found none.

I cannot move the folder (or its parent) to another location – it’s hard-fastened to my desktop.

My drive is BitLocker encrypted, so I can’t mount it on a Linux device for deletion, and leveraging WinPE for BitLocker decryption would likely subject me to the same file APIs.