The Nexen Balzac gas plant just north of Calgary had a Provox control system installed in 1988, the year I started at Spartan. This customer had a service agreement with us from that day on and I took good care of them, despite the serious corrosion problem they had at site.
In 2001, I recommended they get into our new DeltaV control system, stating that the DEC equipment they were using was not improving anymore and the Provox itself was a mature product. This meant more problems, more costs. I had at first suggested just the front end of DeltaV, but when I considered all the problems they had with corroded Provox, I told them the entire system should be replaced. I got the head sales guy to do a presentation which was absolutely deplorable: full of slides that didn't apply, out of sequence. I was embarrassed.
Fortunately they trusted me and we had a new hire by the name of Patrick who took this on as a project. He is brilliant and did a fantastic job of engineering. I helped with hardware, power and grounding, some Scada configuration, and the Provox stuff.
During the big shutdown of 2001 there were a lot of other things to do, but the shutdown coordinators were worried that the conversion we were doing would hold them back. Ha! In 5 days, we removed all the Provox cabinets, furniture, and electronics, installed DeltaV, and had it running and commissioned! Electricians worked 24 hours, 4 guys to a node (Treater, BoilerHouse), to get the wiring done. Pat concentrated on configuration and had Darren and Vern (Nexen Instrument Techs) helping out. We had 4 commissioning teams of 4 guys each to test from graphic to end device. We were done in June; the plant was very late starting in July. Their 10-day turnaround was about 30 days long.
After startup I started getting calls about strange things happening at the plant, where data values on the graphics the operators were looking at were replaced with "&" symbols. Sometimes the proper data would come back, other times it would just "hang up". The solution was to remove the controller in question, then put it back in and that would fix it when there was a hang up. This began to concern the customer very much because while pulling out one controller was fine, leaving the backup controller to take over control, what would happen if both failed?
I spent many days there trying to figure this out. I got Emerson (the makers of DeltaV control system) involved and the first thing we did was look at wiring. This is when I discovered the Ethernet wiring, though fine by communication standards for office wiring, was off the mark for industrial installations where the wire had to be shielded. I told them they had to replace all their wiring, after showing them the data I obtained from my test equipment. Two thousand dollars later, they were still having issues. Oh well…
I started looking elsewhere: grounding, power, lightning strikes, proximity to power plant, etc. Emerson sent me 8 replacement controllers: no effect. I got the power and grounding expert from Austin to come up: nothing out of the ordinary. On Dec 1, 2001, we began to man the plant 24/7. To do that, every available technical Spartan had to serve a stint at the plant, waiting for the event to occur, then resetting the controller. I stayed on day shift to coordinate all this and maintain communication with Emerson, who I was in contact with daily. We had to use my home email account, which I could access from Nexen, to move emails and hot fixes back and forth. We got the controller expert in from Emerson for a week or two and he did some fancy diagnostics to no avail. He left but came back with another guy, a programmer/developer, and did more analysis. Still nothing. I had full support from Spartan and whenever I wanted something I got it. I wanted a scope, I got it. I wanted 5 laptops, I got them. Twice a week Rick Anderson, our VP, would show up and talk to management and to operators, trying to assure them that we would fix the problem. That's right, the Vice President showing up to show his concern and support.
One day it happened: the active and the standby controllers both went "tits-up" at the same time. The configuration was lost and the plant was not under any control, though all the outputs like control valves were holding their last value through the I/O cards. To recover we would have to download the controller, which in itself is always a bad thing because it will cause control valves to slam shut or full open, pumps to start or stop, whatever the conditions are that we consider "base" conditions. It does NOT represent how the plant was running, although regular uploads to the servers can fix that problem. I indicated we were hooped and had to download. While everyone was scurrying around trying to figure out what would be affected and how it would be affected, time was running out to try and match the control parameters that were last in the controllers. If we had downloaded immediately we would have been pretty close to matching how the plant was running. As we delayed more and more, processes changed, tanks were filling, and pressures were building. An HOUR later I was told to go ahead. I hit the download button and immediately the plant went into flare, that is, an ESD or emergency shut down. This is a safety condition where all the gas in the plant is directed to a large stack with a pilot light on top. Millions of cubic feet of gas are burned off as a safety precaution. Wow! The stack is 150 feet tall and the flame was taller than that! And wide! There were many calls to fire departments and the media, and the situation was displayed on the television news because the flame could be seen in Calgary, 30 minutes away. I remember our president Mike Begin saying to himself as he was driving on the highway towards Calgary, "I hope that's not one of our plants". Well, it was. I stood outside and took pictures as all the horrified workers scrambled to get the plant lined out.
There were 10 video monitors at 5 stations in the control room, and every one of the 10 spots was manned.
I worked every day shift in December, mostly 12 hours, but some 16s, and another week or so in January. On Christmas Day, I worked 16 hours, with my boss Bill Elliott in Edmonton coming to relieve me at midnight. He drove all the way to Calgary on Christmas Day to take over. I invited Barbara over and we played Scrabble in the back room, and then at dinner time we joined the 10 operators for a feast that they made: barbecued beef roast, baked buns, hand-squeezed orange juice and other trimmings. I bought a couple bags of food: non-alcoholic drinks, desserts, fruit. It was a memorable Christmas.
We had supervisors doing a shift or two, and guys from Edmonton came in as well. The project manager for the DeltaV installation, Steve Herbert, showed up every day on his way home to Airdrie. Usually he would bring me some test equipment (the scope, the 5 laptops), but he also wanted to stay in the loop because it was his project. I was sending out an email every couple days to Rick Anderson, guys at Emerson, and my boss Bill Elliott. John Chipps was particularly helpful getting manpower: it was his insistence that even guys like Barry Blight (manager) put in a shift.
Kevan Kobasiuk made one of the biggest sacrifices, getting here from Fort St. John BC. At first he tried to fly in an ice storm but all flights were cancelled, so he drove from Fort St. John to Grande Prairie, normally a one-hour drive. It took him about 4 hours, at 5 mph on the icy roads. Then he drove the entire distance from Grande Prairie to Calgary. He arrived after 16 hours and insisted he work his night shift, relieving me in the evening.
I was told by Rick Anderson that a real big Nexen project up north was being delayed for the control system spec'ing until we resolved this problem at the Balzac plant. Nexen wanted DeltaV but not if it was going to be a problem. The fact that we manned the plant 24/7 and got guys from Austin in really worked in our favour. (Nexen Long Lake went ahead with DeltaV after we got this problem resolved.)
I worked on Boxing Day, New Year's Eve and New Year's Day. We had a laptop at each of 5 nodes, sniffing the Ethernet communication, waiting for an event so we could compare data SENT with data RECEIVED. Every time there was an "event", I would capture the files from the 5 laptops and send them out by email to Emerson in Austin. Austin regularly shut down between Christmas and New Year's but not this year: all holidays were cancelled and 6 guys had to be in the office to help out with resolving this problem. Talk about a total team effort! The only PERSON who did not cooperate and was just a pain in the ass was one of the instrument guys at the plant. He would always leave early, wasn't interested in helping in any way (unlike Vern, who did help), and took all of Christmas holidays off.
A few days into the New Year, 2002, Emerson tech support said they found the problem. They had been able to sync up the messages from the various computers and compare notes, finding that what was sent from the controller in the back room was not what was received at the other end, arriving there all garbled. The only thing between the two was a 3Com Ethernet switch. When Emerson contacted 3Com - at the highest levels - they found out that 3Com KNEW about a problem, had a patch for it, but didn't advertise it as a mandatory fix. Obviously they thought these things were tied to printers or something else not as critical as a controller in a gas plant.
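For readers curious what that kind of sent-versus-received comparison amounts to, here is a minimal sketch in Python. It is not Emerson's tooling, and I don't know what their analysis actually looked like. The Frame record, the sequence numbers, and the sample payloads are all made up for illustration, and it assumes the captures from both sides have already been lined up so that each message can be matched by a sequence number.

```python
from typing import Dict, List, NamedTuple


class Frame(NamedTuple):
    seq: int        # hypothetical message sequence number
    payload: bytes  # raw bytes as captured on that side of the switch


def find_corrupted(sent: List[Frame], received: List[Frame]) -> List[int]:
    """Return sequence numbers whose payload differs between the two captures."""
    received_by_seq: Dict[int, bytes] = {f.seq: f.payload for f in received}
    corrupted = []
    for frame in sent:
        got = received_by_seq.get(frame.seq)
        # Frames missing on the receive side are skipped here;
        # only frames that arrived with altered contents are reported.
        if got is not None and got != frame.payload:
            corrupted.append(frame.seq)
    return corrupted


# Made-up sample data: frame 2 arrives garbled on the far side of the switch.
sent = [Frame(1, b"PV=101.3"), Frame(2, b"PV=101.4"), Frame(3, b"PV=101.6")]
received = [Frame(1, b"PV=101.3"), Frame(2, b"\x00&&&\x7f&&&"), Frame(3, b"PV=101.6")]

print(find_corrupted(sent, received))
```

With the sample data above, only frame 2 is flagged. The hard part in real life was the lining-up step, which this sketch simply assumes has been done.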
Within 1 month or so, Emerson stated we would no longer be using 3Com switches and switched to Cisco instead. I downloaded the patch and there was not one problem to the day the 3Coms were finally replaced in summer of 2008.
The service manager for Nexen came to Spartan to make a presentation and thank all those involved for their contribution. He saved lots of nice words for me and sent me a letter. (I had to be at another customer site the day he came, so I heard this from others.) Bill, my boss, gave me overtime for all those days and told me to take time off. I didn't need the time off...there was other work to do. I took the money though.