Syntax highlighting added to Canto34

I’ve just extended my parser toolkit, canto34. Canto34 is a library for building recursive-descent parsers. It can be used both in Node.js and in the browser as an AMD module.

Starting in v0.0.5, I’ve added a function to let you generate a `.tmLanguage` file. This is the syntax highlighting format used by TextMate, Sublime Text, Atom, and Visual Studio Code.

This is actually pretty sweet; it makes your language feel a bit more ‘first class’ when you get the richer IDE experience.

You first generate the file content, then save it wherever your text editor demands. For example, to use it with Sublime Text 3 on Windows, call code like so, swapping `{myusername}` for the name of your user account;

    var content = canto34.tmLanguage.generateTmLanguageDefinition(lexer);
    require('fs').writeFileSync("C:\\Users\\{myusername}\\AppData\\Roaming\\Sublime Text 3\\Packages\\User\\myparser.tmLanguage", content);

This will write out a syntax highlighting file into your User folder, which is Sublime Text’s personal ‘dumping ground’ for configuration bits and pieces on your machine. When you restart ST, you’ll see your language in the list of language options, displayed in the bottom-right.

You’ll need to configure the tokens in your lexer a bit more. For example, if you have a variable name token, you’ll need to tell the lexer that this should be highlighted one way, and if you have a keyword like ‘for’, you’ll use another. Do this with the `role` option on a token type;

    lexer.addTokenType({ name: "comment", ignore: true, regexp: /^#.*/, role: ["comment", "line"] });
    lexer.addTokenType(types.constant("for", "for", ["keyword"]));
    lexer.addTokenType(types.constant("[", "openSquare", ["punctuation"]));
    lexer.addTokenType(types.constant("]", "closeSquare", ["punctuation"]));

The full list of roles isn’t clear to me; I’ve just figured out some key ones, like `keyword`, `comment`, and `punctuation`, by reverse-engineering existing `.tmLanguage` files.

Find out more on GitHub or NPM.

Pretty-printing recordsets from the mssql node package

I’ve been using the node mssql package in node scripts, and it’s great, but it doesn’t have a pretty-printer for recordsets; `console.dir()` will print something out, but you get some nasty-looking JSON-formatted output.

This function is just a reusable printer for recordsets returned from the `sql.Request().query()` or `sql.Request().batch()` methods.

Pass in the recordset and a function which prints the line, eg;

    printTable(recordset, console.log);

And you get something more familiar, with a SQL Server Management Studio feel;

-- SELECT id, name from table
id   name
===========
6    p1
7    p2
8    p3

Here’s the function:

// pretty-print a table returned from mssql node package 
// https://www.npmjs.com/package/mssql
var printTable = function(recordset, printLine) {

 var padEnd = function (str, paddingValue) {
   while(str.length < paddingValue) {
     str = str + " ";
   }
   return str;
 };

 var print = function(value) {
   return (value === undefined || value === null) ? "(null)" : value.toString();
 };

 var maxWidth = {};

 for(var c in recordset.columns) {
   maxWidth[c] = c.length;
 }

 var l = recordset.length;
 for(var r = 0; r < l; r++) {
   var row = recordset[r];
   for(var c in recordset.columns) {
     var col = recordset.columns[c];
     row[c] = print(row[c]);
     maxWidth[c] = Math.max(maxWidth[c], row[c].length);
   }
 }

 var head = "";
 for(var c in recordset.columns) {
   head = head + padEnd(c, maxWidth[c]) + " ";
 }

 printLine(head);

 var sep = Array(head.length).join("=");
 
 printLine(sep);

 for(var r = 0; r < l; r++) {
   var row = recordset[r];
   var printedRow = "";
   for(var c in recordset.columns) {
     printedRow = printedRow + padEnd(row[c], maxWidth[c]) + " ";
   }
   printLine(printedRow);
 }
};
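For what it’s worth, on newer Node versions (8+, which ship `String.prototype.padEnd` natively) the same idea can be sketched rather more compactly. The mock recordset here is my stand-in for what mssql returns: an array of row objects carrying a `columns` property;

```javascript
// Compact sketch of the same pretty-printer, assuming Node 8+ for padEnd.
var printTable = function (recordset, printLine) {
  var print = function (value) {
    return (value === undefined || value === null) ? "(null)" : value.toString();
  };

  // widest of: the column name, or any value in that column
  var widths = {};
  for (var c in recordset.columns) {
    widths[c] = c.length;
    recordset.forEach(function (row) {
      widths[c] = Math.max(widths[c], print(row[c]).length);
    });
  }

  var head = Object.keys(recordset.columns)
    .map(function (c) { return c.padEnd(widths[c]); })
    .join(" ");
  printLine(head);
  printLine("=".repeat(head.length));

  recordset.forEach(function (row) {
    printLine(Object.keys(recordset.columns)
      .map(function (c) { return print(row[c]).padEnd(widths[c]); })
      .join(" "));
  });
};

// mock recordset in the shape the mssql package returns
var recordset = [
  { id: 6, name: "p1" },
  { id: 7, name: "p2" }
];
recordset.columns = { id: {}, name: {} };

printTable(recordset, console.log);
```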

Calculating expected finger table entries for virtual ring distributed systems like Chord and Cassandra

Here’s a block of code I wrote to print out finger tables for Chord-like distributed hash tables. Just update the `nodes` variable and the `m` value to adjust it to any system. It runs happily in Node.js 4.4.0.

// calculating finger tables for Chord-like distributed hash table.

// the size of the ring, as 2^m;
var m=9, ringSize = Math.pow(2,m);

// the nodes we know about
var nodes = [1, 12, 123, 234, 345, 456, 501]; 

// just make sure it's sorted
nodes = nodes.sort( (a,b) => a-b );

console.log('the finger tables for a ' + ringSize + '-virtual-node system');
console.log('========');
console.log('nodes: ', nodes);

// calculate the finger table for each node
nodes.forEach(n => {

  var fingerTable = [];
  var i = 1;
  for(var x = 0; x < m; x++) {
    var nodeId = (n + i) % ringSize;
    // find the first node at or after nodeId, wrapping around the ring
    var ftEntry = nodes.filter(id => id >= nodeId).concat(nodes)[0];
    fingerTable.push(ftEntry);
    i = i << 1;
  }

  console.log(n + ':', fingerTable);
});
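To sanity-check the successor logic, here’s the worked example from the original Chord paper: m = 3 (a ring of 8 ids) with nodes 0, 1, and 3. The `fingerTableFor` helper is just the loop above restated as a function;

```javascript
// Chord paper example: m = 3, nodes {0, 1, 3}.
var m = 3, ringSize = Math.pow(2, m);
var nodes = [0, 1, 3]; // must be sorted

var fingerTableFor = function (n) {
  var table = [];
  for (var k = 0; k < m; k++) {
    var start = (n + Math.pow(2, k)) % ringSize;
    // successor: the first node at or after `start`, wrapping around the ring
    var succ = nodes.filter(function (id) { return id >= start; }).concat(nodes)[0];
    table.push(succ);
  }
  return table;
};

console.log(fingerTableFor(1)); // starts are 2, 3, 5 → [ 3, 3, 0 ]
```

The paper gives node 1’s fingers as 3, 3, 0, which is what this prints; note that this only comes out right if the successor test is `>=`, not `>`.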

Fun With ORM Includes!

This is a cautionary tale about how an ORM (in this case, Entity Framework) can screw your performance if you don’t pay attention to the queries it generates. Particularly, the use of recursive includes.

Here’s a simplified version of the original problem;

var person = await DataContext.Set<Person>()
 .Include(p => p.Groups)
 .Include(p => p.Manager)
 .Where(p => p.Id == personId)
 .SingleOrDefaultAsync();

So here’s the intent, and the bug. The code above is supposed to say ‘load the person with the specified ID, and include his/her groups and manager’.

And here’s the problem. EF sees that you want to load Person.Manager. Person.Manager is a Person, so we’ll need to include the manager’s groups. But ah! Person.Manager has a Manager, which has a Manager, which has a Manager. EF goes batshit at this point, and generates over 90 joins;

person join person join person join person join person ....

And that query goes off, loads far too much data, and takes far too much time; in the order of seconds, for a query that should load in small numbers of milliseconds.

The fix I put in was to divide this into two, much faster queries. Something more like this;

// don't load manager etc.
var person = await DataContext.Set<Person>()
    .Include(p => p.Groups)
    .Where(p => p.Id == personId)
    .SingleOrDefaultAsync();

if (person.ManagerId.HasValue) {
    var manager = await DataContext.Set<Person>()
        .Include(p => p.Groups)
        .Where(p => p.Id == person.ManagerId.Value)
        .FirstOrDefaultAsync();
}

So we hit the database twice, but with much cheaper queries. This may not be perfect (I’d like to get everything in one shot) but I couldn’t think of a better approach.

Here’s the takeaway, though. It’s almost never right to use a recursive include (like Person.Manager) unless you want to load complete branches or ancestor paths. When mixed with other includes (like Person.Groups) it leads to an explosion in the complexity of the query. So;

  • Measure the performance of any queries you write. Use the free Stackify Prefix or some other tool to make sure your queries are small-numbers-of-milliseconds fast. If not, question whether you could make them faster.
  • Use Stackify Prefix to examine the generated SQL. Sure, EF can be a bit verbose, but any time it generates more than a page of SQL, question whether the query is as simple as it could be.
  • Don’t ask for it if you don’t need it. Don’t Include() in data just in case.
  • Two very cheap queries beat one awful one. EF’s only so clever, and it doesn’t handle lots of includes well, so don’t be afraid of two efficient queries when EF goes mental.
  • On the other hand, don’t take that as permission to write an N+1 bug. This is where you load a ‘master’ object, then iterate through children, loading each in turn. This will kill your server, and it’s the most common database performance bug written in enterprise software.


Simulating slow POST requests with Fiddler

Sometimes, your dev machine is too damned fast, and you need to slow down how something works in order to investigate a potential bug. This post shows how to do that with a custom rule for Fiddler.

We recently had to make sure that a button on a web page worked like this: the user clicks the button; the button is disabled; the work completes; the button is re-enabled.

However, on a fast developer machine, the whole process can be far too quick. What you need to do is simulate the server taking a long time, so that you can visually confirm the delay.

Turns out you can use Fiddler to do that, with a custom rule. Rules are written in FiddlerScript, a strange JavaScript variant (it’s actually JScript.NET). (Telerik; I’m pretty sure we wouldn’t mind it being plain old JavaScript, if that’s ok!)

Here’s what you need to do;

Start Fiddler.

In the “Rules” menu, choose “Customize Rules…”

About line 65, paste this fragment;

public static RulesOption("Delay &POSTs by 2s", "Per&formance")
var m_DelayPostResponses: boolean = false;

There will be similar pairs of lines thereabouts; just put it with the similar code.

What this does is register a menu option at Rules > Performance > Delay POSTs by 2s. It’ll appear when you save the file.

Then, at the top of the OnBeforeRequest(Session) method around line 162, paste this;

if (m_DelayPostResponses && oSession.HTTPMethodIs("POST")) {
  // active wait loop
  var msecs = 2000;
  var start = new Date().getTime();
  while(new Date().getTime() - start < msecs) { 
  }
}

This introduces a 2-second wait loop for POSTs, if you’ve turned the option on in the Rules > Performance menu.

So once you save this file, Fiddler automatically reloads it and the toggling menu item appears. Use it whenever you want to test that your code copes with slow POST methods.
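If you want to see the busy-wait trick in isolation, the same pattern works in plain Node.js; `busyWait` here is just the loop from the rule, pulled out into a function;

```javascript
// Busy-wait for a given number of milliseconds. This deliberately blocks
// the thread, which is what makes it useful for simulating a slow server.
var busyWait = function (msecs) {
  var start = Date.now();
  while (Date.now() - start < msecs) {
    // spin
  }
};

var before = Date.now();
busyWait(200);
console.log("waited", Date.now() - before, "ms");
```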

Micro-optimisation; ‘starts with single-character string’ vs str[0]

I’m writing a string validation routine with very stringent performance requirements. One part is a short-cut that kicks in if we might potentially be looking at a JSON object or JSON array. I’d written

var mightBeJson = value.StartsWith("{") || value.StartsWith("[");

as a short-cut. It turned out that, since this is almost the first thing that happens in the code, it was a massive, terrible hit on performance. Switching to treating the string as a char array makes a massive difference;

 var mightBeJson = value[0] == '[' || value[0] == '{';

Night and day difference!

Takeaway: never compare single-character strings like “X” when you can compare chars like ‘X’ directly.
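The same idea carries over to JavaScript, for what it’s worth; `mightBeJson` below is my sketch of the equivalent check, with a length guard so an empty string is handled explicitly;

```javascript
// true if the string might be a JSON object or array;
// value[0] indexes the first character without allocating a new string
var mightBeJson = function (value) {
  return value.length > 0 && (value[0] === "{" || value[0] === "[");
};

console.log(mightBeJson('{"a":1}')); // true
console.log(mightBeJson("hello"));   // false
```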

Setting up log4net to write to trace.axd

I’ve just been looking at logging info from MVC sites through to ASP.NET’s trace.axd handler. Here’s what you need to do.

1) Install log4net from NuGet;

Install-Package log4net

2) Tell log4net to read its config from XML – specifically, from your web.config file;

// Global.asax.cs;
 protected void Application_Start()
 {
   ...
   XmlConfigurator.Configure();
   ...
 }

3) Update web.config with three sections: one to register the log4net element, the log4net config itself, and the ‘trace’ element to turn on ASP.NET tracing;

web.config, part 1 — register the config section;

<configuration>
  <configSections>
    <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler, log4net" />

web.config, part 2 — add the log4net element under <configuration>;

<log4net debug="true">
  <!-- writes to normal .net trace, eg DBGVIEW.exe, VS Output window -->
  <appender name="TraceAppender" type="log4net.Appender.TraceAppender">
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date [%thread] %-5level - %message%newline" />
    </layout>
  </appender>
 
  <!-- writes to ASP.NET Tracing, eg Trace.axd -->
  <appender name="AspNetTraceAppender" type="log4net.Appender.AspNetTraceAppender" >
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date [%thread] %-5level %logger [%property{NDC}] - %message%newline" />
    </layout>
  </appender>
 
  <!-- writes errors to the event log -->
  <appender name="EventLogAppender" type="log4net.Appender.EventLogAppender">
    <applicationName value="AI Track Record" />
    <threshold value="ERROR" />
    <layout type="log4net.Layout.PatternLayout">
     <conversionPattern value="%date [%thread] %-5level %logger [%property{NDC}] - %message%newline" />
    </layout>
  </appender>
 
  <root>
    <level value="ALL" />
    <appender-ref ref="AspNetTraceAppender" />
    <appender-ref ref="EventLogAppender" />
    <appender-ref ref="TraceAppender" />
  </root>
 
</log4net>

web.config, part 3 — enable the trace element in <configuration> / <system.web>;

<configuration>
  <system.web>
    <trace writeToDiagnosticsTrace="true" enabled="true" pageOutput="false" requestLimit="250" mostRecent="true" />
  </system.web>
</configuration>

4) Finally, to actually use the log, you need to create a logger;

// maybe add this to each class, or to be fancy, inject using Ninject;
private static readonly ILog log = LogManager.GetLogger(typeof(MyClassName));

and write to it using methods like Info, Debug, etc;

log.Debug("Here's my excellent message");

5) Check Trace.axd! You’ll see your messages written to the ‘Trace Information’ section.

Summary

Things to note — each appender has its own format, in the <conversionPattern> element, and its own ‘seriousness’, in the <threshold> element. So you can set up fine-grained control, like logging absolutely everything to a file and only errors to the event log, or writing sparse information to trace.axd while writing verbose messages to a database table.