Simple coloured output from CMD.EXE

It’s stupid to have to post this, but it’s even more stupid that this functionality isn’t built in.

Suggestions on the glorious Internet are just nasty: use PowerShell or use hacks around other built-in functions.

Instead, I’ve thrown together a 10KB C++ console app that starts and exits almost instantly.

To use it, you simply pipe what you want to display to the executable. You pass in one or two hexadecimal digits as a parameter and that colours the output accordingly:

[Screenshot: colourecho in use]

The colour values are the ones defined in the wincon.h header file. They are fairly limited, so you may be best off simply experimenting with different values. The first digit is the background and the second is the foreground; not all values are supported for the background.
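
For anyone curious about what is involved, here is a sketch of how such a tool can be built on the Windows console API. It is illustrative only, not the source of the download below, and it assumes the executable is named colourecho.exe: read the optional hex attribute, apply it with SetConsoleTextAttribute, echo standard input, then restore the original colours.

    // Minimal sketch of a colourecho-style tool (illustrative; not the actual source).
    // Usage (assuming the executable is named colourecho.exe):
    //     echo Build succeeded | colourecho 0A
    // The argument is one or two hex digits: with two digits the first is the background
    // and the second is the foreground, matching the attribute values in wincon.h.
    #include <windows.h>
    #include <cstdlib>
    #include <iostream>
    #include <string>

    int main(int argc, char* argv[])
    {
        HANDLE hOut = GetStdHandle(STD_OUTPUT_HANDLE);

        // Remember the current attributes so they can be restored afterwards.
        CONSOLE_SCREEN_BUFFER_INFO info = {};
        GetConsoleScreenBufferInfo(hOut, &info);

        WORD attr = info.wAttributes;
        if (argc > 1)
            attr = static_cast<WORD>(std::strtoul(argv[1], nullptr, 16));

        SetConsoleTextAttribute(hOut, attr);

        // Echo whatever is piped in, line by line, in the requested colours.
        std::string line;
        while (std::getline(std::cin, line))
            std::cout << line << '\n';

        // Restore the original colours.
        SetConsoleTextAttribute(hOut, info.wAttributes);
        return 0;
    }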


The code and executable are available here:

https://app.box.com/s/nnyv082efsiedhl2rrxw0qs761jm9t1w


Why Big Data Analytics is harder than you think: numerical precision matters!

How big is your Big Data data set? If it is not measured in petabytes then perhaps it doesn’t qualify. In any case, as the amount of data grows, the challenges grow disproportionately, and in areas that perhaps you don’t expect. In this post, I raise a number of challenges and concerns that you should be thinking about, especially if you are interested in business analytics on large data sets.

My recommendation: if you are serious about large data sets and numerical precision, you have no choice but to adopt 128-bit arithmetic in most scenarios, especially financial ones.
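
As a toy illustration of the kind of drift involved (this example is mine, not taken from the full post): naively summing one hundred million ten-cent values in IEEE 754 double already misses the exact total, because 0.10 is not exactly representable in binary floating point and the rounding error compounds with every addition.

    // Illustrative only: naive double-precision summation versus an exact total in integer cents.
    #include <cstdio>

    int main()
    {
        const long long n = 100000000;   // 1e8 "transactions"
        const double amount = 0.10;      // 10 cents; not exactly representable in binary

        double naive = 0.0;
        for (long long i = 0; i < n; ++i)
            naive += amount;             // rounding error accumulates with every addition

        const double exact = (n * 10LL) / 100.0;   // exact total computed in integer cents

        std::printf("naive : %.6f\nexact : %.6f\ndrift : %.6f\n", naive, exact, exact - naive);
        return 0;
    }

The exact size of the drift depends on the compiler and platform, but it is not zero and it grows with the row count, which is exactly the point: at Big Data scale, wider (for example 128-bit) arithmetic is needed to keep financial aggregates trustworthy.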

Continue reading

Performance in SQL Server 2012 degrades non-linearly as SELECT clauses are added when using ColumnStore

When adding additional SELECT clauses to an aggregating query over a large fact table and using ColumnStore, the performance degrades in a step-wise linear fashion with large steps.  It may be quicker to execute several less-complex queries rather than a single complex query.

I’ve submitted a Microsoft Connect bug report here: https://connect.microsoft.com/SQLServer/feedback/details/761895/.

Continue reading for full details, or download this attachment: https://www.box.com/s/5hp9f0ditg0fspghu506 [PDF file also available via Microsoft Connect].  Actual test results, steps to reproduce, etc., as well as more pretty graphs :), are included.

[Chart: elapsed time by SUM measure count (base query)]

Continue reading

Performance in SQL Server 2012 degrades when using ColumnStore and no GROUP BY clause

Essentially, the performance of a non-grouping SQL SELECT query degrades when applied to a ColumnStore index.  This has been tested with SQL Server 2012 RTM CU1.  Performing partial aggregation can result in a 15x performance improvement in some of the cases I observed.

See the full details in my Microsoft Connect submission here: https://connect.microsoft.com/SQLServer/feedback/details/761469/. Or download the details here: https://www.box.com/s/frf7imhyclb2efz2tfvb.

Continue reading

Why software developers need to look beyond frameworks and languages

My career as a software developer has been fairly unusual in that I spent about six years as an IT manager for two companies. It was not an intentional change of focus; it happened unexpectedly after I took a job as an Oracle database developer.  That job changed shape immediately, in the form of an offer for a position I had never interviewed for. It was a big undertaking for both my employer and for me, but in the process it exposed me to many challenges that most developers have no experience of.

It is my belief that my current software development skills have been enriched as a result. I also believe that other software developers would be able to act more effectively with a little investigation around the edges of their own areas of expertise. In this brief article, I hope to encourage you, my readers, to swim a little further out to sea. I hope you find it useful and can make your way back against the tide!

Testing for correctness

Have you heard of integration testing? This is where you test how the software that you and your team have developed interacts with the surrounding eco-system. Perhaps you already do a lot of integration testing. If not, I certainly encourage you to do it, at least in some form. However, if you are performing integration testing without considering the wider eco-system (networks, storage devices, processing platform, operating system options and configuration, drivers, patch level, etc.), then you are leaving a multitude of areas in your product untested. Whether these risks are acceptable to your business is a matter of judgement, but wouldn’t it be nice to at least smoke test these areas? Or, alternatively, wouldn’t it be nice to know how the test environment resembles, and how it differs from, the real-world environment? Not being able to answer these questions means that you won’t be able to give the IT manager deploying your software the answers to their justifiable and reasonable questions.

Product ownership and leadership

One of the major failings in the modern world (I have seen it in governments, commercial businesses, charities, churches and social groups) is abdication of ownership.  More generally, I see this as a failing in leadership, but that is a discussion probably best conducted in another post.  A failure in ownership typically means that important things are left undone, because there is no one who considers it their job to ensure that it is done.  Note how I have phrased that last sentence: “to ensure that it is done”.  It is not always the case that the person taking ownership of an activity is the same person responsible for performing it.  Nevertheless, not having someone who is willing, able and empowered to take ownership of the activity undermines success almost every time.  We need more people to take ownership of the things for which they have been given responsibility, or for which they are at least seen as responsible.

In relation to software development, ownership leads to a need for a product team to understand the eco-system in which their product will be deployed.  I mean really understand it: understand it to the degree necessary to be able to describe the entirety of the product function for which they are responsible.

This means that they should know how to deploy it.  They should know how it has been tested.  They should know how it has been designed.  Importantly, they must, they absolutely must, be able to describe how to use it, in all of its uses, to their user community.  Finally, they must be able to maintain it and support it.

I cannot see that any of this is possible without the team taking a high level of ownership of the entire product for which they are responsible.

Owning a deployment platform

If this level of ownership is to be achieved, then software developers must aim for at least level 3 of the Capability Maturity Model: Defined—a standard business process.  In other words, they must be able to correctly deliver an equivalent configuration based on a repeatable and well-described process.  This is not possible without understanding the configuration of the operating system, and equally of related functional areas.

For example, a very simple but typical two-tier web application deployed on self-hosted hardware would need:

  • Two physical machines
    • One single-disk system with minimal CPU and RAM
    • One with a RAID-1 array for the operating system and database log, and a RAID-5 array for all other data files
  • One network switch
  • Two operating system installations
  • One web server installation
  • One database server installation
  • One database schema deployment
  • One web application deployment
  • Two firewall configurations
  • Two security configurations
  • etc.

The complexity quickly builds up.  Is there a SAN?  Is there a central network authentication system?  What protocols are being used? UDP? TCP? IPsec? LDAP? Kerberos? SOAP? XMLA?

Taking ownership of a product means taking ownership of the aspects of this that matter to the product’s correct function.  At the very least, it means knowing how to deploy the software and how to configure each part of the overall solution for correct behaviour.  It does not mean understanding everything; it just means understanding enough.

We need software developers to recognise that they develop products deployed into eco-systems, not products acting in isolation (well, at least that is true for most of us).  Therefore, it is critically important that we take responsibility and ownership for what we know and for what we require to be controlled.  We need to improve the quality of software development in the industry.  Would it not help if we could at least describe how the software we are responsible for creating actually works?

Chrome almost supports SSO in Windows Kerberos environments

I was pleasantly surprised to find that Google Chrome has support for SSO and the Negotiate authentication scheme. Indeed, it also has support for NTLM. So why the need for this post? I think the implementation could do with a little refinement.

Here’s my assumption: credential delegation in a Kerberos environment is managed by the Kerberos system and its configuration; clients should not attempt to interfere with it. However, Google Chrome disallows ticket forwarding by default, effectively preventing delegation (constrained or otherwise). You can change this with a command-line option, but that means you have to know the option exists and have to plan to set it for every user of your web site. That seems the wrong way round to me. This default means that, out of the box, most web sites of any complexity will not operate as designed.

Secondly, the default SPN behaviour is incorrect for Windows platforms. The Kerberos specification does not say much about SPNs, but they do at least have several parts: the service type, the host and port, and optionally an additional service identifier. Including the port is standard, but Chrome doesn’t do this by default. In addition, Chrome’s default behaviour is to resolve DNS CNAME records to A records and use the result for the host part. I can’t fault Google for this approach, but it does differ from the widely documented Windows approach of building the SPN from the host header (i.e. before CNAME resolution). (As an aside, if you take Chrome’s approach, why not go further and use the IPv4 address, or the IPv6 address, and what happens if the machine is multi-homed?) It also interferes with the ability of a host to provide multiple independent services, because with Chrome’s approach they all end up with identical SPNs. In Chrome’s defence, these options can also be controlled via the command line.
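
For reference, an invocation using those switches looks roughly like the following. The switch names are taken from the Chromium HTTP authentication page linked at the end of this post, may change between Chrome versions, and *.example.com is simply a placeholder for your own domain:

    :: Example only; switch names as documented by the Chromium project at the time of writing.
    :: --auth-server-whitelist                sites allowed to use integrated (Negotiate/NTLM) authentication
    :: --auth-negotiate-delegate-whitelist    sites to which Kerberos tickets may be forwarded (delegation)
    :: --enable-auth-negotiate-port           include a non-standard port in the generated SPN
    :: --disable-auth-negotiate-cname-lookup  build the SPN from the host header rather than the CNAME target
    chrome.exe --auth-server-whitelist="*.example.com" ^
               --auth-negotiate-delegate-whitelist="*.example.com" ^
               --enable-auth-negotiate-port ^
               --disable-auth-negotiate-cname-lookup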

Finally, note that Chrome’s NTLMv2 support is only available on Windows platforms. Chrome supports NTLMv1 on other platforms, but that is horrendously insecure! This is not intended as a negative comment on Chrome, just something to be aware of.

It is great to see other browsers finally supporting SSO, Negotiate, NTLM and Kerberos. I just hope that interoperability is considered a desirable end goal. Without it, these are just more competing proprietary solutions, and that would be a shame.

Material about Google Chrome was taken from here: HTTP authentication [The Chromium Projects]. See my recent post about Kerberos in Windows for links to supporting Windows implementation materials.