Reverse-engineering and analysis of SanDisk High Endurance microSDXC card

As seen on Hackaday!

TL;DR – The SanDisk High Endurance cards use SanDisk/Toshiba 3D TLC Flash. It took way, way more work than it should have to figure this out (thanks for nothing, SanDisk!).

In a previous blog post, I took a look at SanDisk’s microSD cards that were aimed for use in write-intensive applications like dash cameras. In that post I took a look at its performance metrics, and speculated about what sort of NAND Flash memory is used inside. SanDisk doesn’t publish any detailed specifications about the cards’ internal workings, so that means I have no choice but to reverse-engineer the can of worms card myself.

The Help Desk That Couldn’t

In the hopes of uncovering more information, I sent an email to SanDisk’s support team asking about what type of NAND Flash they are using in their High Endurance lineup of cards, alongside endurance metrics like P/E (Program/Erase) cycle counts and total terabytes written (TBW). Unfortunately, the SanDisk support rep couldn’t provide a satisfactory answer to my questions, as they’re not able to provide any information that’s not listed in their public spec sheets. They said that all of their cards use MLC Flash, which I guess is correct if you call TLC Flash 3-bit MLC (which Samsung does).

Dear Jason,

Thank you for contacting SanDisk® Global customer care. We really appreciate you being a part of our SanDisk® family.

I understand that you wish to know more about the SanDisk® High Endurance video monitoring card, as such please allow me to inform you that all our SanDisk® memory cards uses Multi level cell technology (MLC) flash. However, the read/write cycles for the flash memory is not published nor documented only the read and write speed in published as such they are 100 MB/S & 40 MB/s. The 64 GB card can record Full HD video up to 10,000 hours. To know more about the card you may refer to the link below:

SANDISK HIGH ENDURANCE VIDEO MONITORING microSD CARD

Best regards,
Allan B.
SanDisk® Global Customer Care

I’ll give them a silver star that says “You Tried” at least.

Anatomy of an SD Card

While (micro)SD cards feel like a solid monolithic piece of technology, they’re made up of multiple different chips, each performing a different role. A basic SD card will have a controller that manages the NAND Flash chips and communicates with the host (PC, camera, etc.), and the NAND Flash itself (made up of 1 or more Flash dies). Bunnie Huang’s blog, Bunnie Studios, has an excellent article on the internals of SD cards, including counterfeits and how they’re made – check it out here.

SD Card Anatomy

Block diagram of a typical SD card.

MicroSD cards often (but not always!) include test pads, used to program/test the NAND Flash during manufacture. These can be exploited in the case of data recovery, or to reuse microSD cards that have a defective controller or firmware by turning the card into a piece of raw NAND Flash – check out Gough Lui’s adventures here. Note that there is no set standard for these test pads (even for the same manufacturer!), but there are common patterns for some manufacturers like SanDisk that make reverse-engineering easier.

Crouching Controller, Hidden Test Pads

microSD cards fall into a category of “monolithic” Flash devices, as they combine a controller and raw NAND Flash memory into a single, inseparable package. Many manufacturers break out the Flash’s data bus onto hidden (and nearly completely undocumented) test pads, which some other memory card and USB drive manufacturers take advantage of to make cheap storage products using failed parts; the controller can simply be electrically disabled and the Flash is then used as if it were a regular chip.

In the case of SanDisk cards, there is very limited information on their cards’ test pad pinouts. Each generation has slight differences, but the layout is mostly the same. These differences can be fatal, as the power and ground connections are sometimes reversed (this spells instant death for a chip if its power polarity is mixed up!).

CORRECTION (July 22, 2020): Upon further review, I might have accidentally created a discrepancy between the leaked pinouts online, versus my own documentation in terms of power polarity; see the “Test Pad Pinout” section.

My card (and many of their higher-end cards – that is, not their Ultra lineup) features test pads that aren’t covered by solder mask, but are instead covered by some sort of screen-printed epoxy with a laser-etched serial number on top. With a bit of heat and some scraping, I was able to remove the (very brittle) coating on top of the test pads; this also removed the serial number which I believe is an anti-tamper measure by SanDisk.

After cleaning off any last traces of the epoxy coating, I was greeted with the familiar SanDisk test pad layout, plus a few extra on the bottom.

Building the Breakout Board

The breakout board is relatively simple in concept: for each test pad, bring out a wire that goes to a bigger test point for easier access, and wire up the normal SD bus to an SD connector to let the controller do its work with twiddling the NAND Flash bus. Given how small each test pad is (and how many), things get a bit… messy.

I started by using double-side foam adhesive tape to secure the SD card to a piece of perfboard. I then tinned all of the pads and soldered a small 1uF ceramic capacitor across the card’s power (Vcc) and ground (GND) test pads. Using 40-gauge (0.1 mm, or 4-thousandths of an inch!) magnet wire, I mapped each test pad to its corresponding machine-pin socket on the perfboard. Including the extra test pads, that’s a total of 28 tiny wires!

For the SD connector side of things, I used a flex cable for the XTC 2 Clip (a tool used to service HTC Android devices), as it acted as a flexible “remote SD card” and broke out the signals to a small ribbon cable. I encased the flex cable with copper tape to act as a shield against electrical noise and to provide physical reinforcement, and soldered the tape to the outer pads on the perfboard for reinforcement. The ribbon cable end was then tinned and each SD card’s pin was wired up with magnet wire. The power lines were then broken out to an LED and resistor to indicate when the card was receiving power.

Bus Analysis

With all of the test pads broken out to an array of test pins, it was time to make sense of what pins are responsible for accessing the NAND Flash inside the card.

Test Pad Pinout

Diagram of the test pads on SanDisk's High Endurance microSD card.

Diagram of the test pads on SanDisk’s High Endurance microSD card. (click to enlarge)

The overall test pad pinout was the same for other microSD cards from SanDisk, but there were some differences, primarily regarding the layout of the power pads; notably, the main power pins are backwards! This can destroy the card if you’re not careful when applying power.

CORRECTION (July 22, 2020): I might actually have just gotten my own documentation mixed up in terms of the power and ground test pads. Regardless, one should always be careful to ensure the correct power polarity is sent to a device.

I used my DSLogic Plus logic analyzer to analyze the signals on all of the pins. Since the data pinout was previously discovered, the hard part of figuring out what each line represented (data bus, control, address, command, write-protect, ready/busy status) was already done for me. However, not all of the lines were immediately evident as the pinouts I found online only included the bare minimum of lines to make the NAND Flash accessible, with one exception being a control line that places the controller into a reset state and releases its control of the data lines (this will be important later on).

By sniffing the data bus at the DSLogic’s maximum speed (and using its 32 MB onboard buffer RAM), I was able to get a clear snapshot of the commands being sent to the NAND Flash from the controller during initialization.

Bus Sniffing & NAND I/O 101 (writing commands, address, reading data)

In particular, I was looking for two commands: RESET (0xFF), and READ ID (0x90). When looking for a command sequence, it’s important to know how and when the data and control lines change. I will try to explain it step-by-step, but if you’re interested there is an introductory white paper by Micron that explains all of the fundamentals of NAND Flash with much more information about how NAND Flash works.

When a RESET command is sent to the NAND Flash, first the /CE (Chip Select, Active Low) line is pulled low. Then the CLE (Command Latch Enable) line is pulled high; the data bus is set to its intended value of 0xFF (all binary ones); then the /WE (Write Enable, Active Low) line is pulsed from high to low, then back to high again (the data bus’ contents are committed to the chip when the /WE line goes from low to high, known as a “rising edge”); the CLE line is pulled back low to return to its normal state. The Flash chip will then pull its R/B (Ready/Busy Status) line low to indicate it is busy resetting itself, then releases the line back to its high state when it’s finished.

The READ ID command works similarly, except after writing the command with 0x90 (binary 1001 0000) on the data bus, it then pulls the ALE (Address Latch Enable) line high instead of CLE, and writes 0x00 (all binary zeroes) by pulsing the /WE line low. The chip transfers its internally programmed NAND Flash ID into its internal read register, and the data is read out from the device on each rising edge of the /RE (Read Enable, Active Low) line; for most devices this is 4 to 8 bytes of data.

NAND Flash ID

For each NAND Flash device, it has a (mostly) unique ID that identifies the manufacturer, and other functional data that is defined by that manufacturer; in other words, only the manufacturer ID, assigned by the JEDEC Technology Association, is well-defined.

The first byte represents the Flash manufacturer, and the rest (2 to 6 bytes) define the device’s characteristics, as set out by the manufacturer themselves. Most NAND vendors are very tight-lipped when it comes to datasheets, and SanDisk (and by extension, Toshiba/Kioxia) maintain very strict control, save for some slightly outdated leaked Toshiba datasheets. Because the two aforementioned companies share their NAND fabrication facilities, we can reasonably presume the data structures in the vendor-defined bytes can be referenced against each other.

In the case of the SanDisk High Endurance 128GB card, it has a NAND Flash ID of 0x45 48 9A B3 7E 72 0D 0E. Some of these values can be compared against a Toshiba datasheet:

Byte Value (Hex) Description/Interpretation
45
  • Manufacturer: SanDisk
48
  • I/O voltage: Presumed 1.8 volts (measured with multimeter)
  • Device capacity: Presumed 128 GB (unable to confirm against datasheet)
9A
  • NAND type: TLC (Triple-Level Cell / 3 bits per cell)
  • Flash dies per /CE: 4 (card uses four 32GB Flash chips internally)
B3
  • Block size: 12 MiB (768 pages per block) excluding spare area (determined outside datasheet)
  • Page size: 16,384 bytes / 16 kiB excluding spare area
7E
  • Planes per /CE: 8 (2 planes per die)
72
  • Interface type: Asynchronous
  • Process geometry: BiCS3 3D NAND (determined outside datasheet)
0D
  • Unknown (no information listed in datasheet)
0E
  • Unknown (no information listed in datasheet)

Although not all byte values could be conclusively determined, I was able to determine that the SanDisk High Endurance cards use BiCS3 3D TLC NAND Flash, but at least it is 3D NAND which improves endurance dramatically compared to traditional/planar NAND. Unfortunately, from this information alone, I cannot determine whether the card’s controller takes advantage of any SLC caching mechanisms for write operations.

The chip’s process geometry was determined by looking up the first four bytes of the Flash ID, and cross-referencing it to a line from a configuration file in Silicon Motion’s mass production tools for their SM3271 USB Flash drive controller, and their SM2258XT DRAM-less SSD controller. These tools revealed supposed part numbers of SDTNAIAMA-256G and SDUNBIEMM-32G respectively, but I don’t think this is accurate for the specific Flash configuration in this card.

External Control

I wanted to make sure that I was getting the correct ID from the NAND Flash, so I rigged up a Texas Instruments MSP430FR2433 development board and wrote some (very) rudimentary code to send the required RESET and READ ID commands, and attempt to extract any extra data from the chip’s hidden JEDEC Parameter Page along the way.

My first roadblock was that the MSP430 would reset every time it attempted to send the RESET command, suggesting that too much current was being drawn from the MSP430 board’s limited power supply. This can occur during bus contention, where two devices “fight” each other when trying to set a certain digital line both high and low at the same time. I was unsure what was going on, since publicly-available information didn’t mention how to disable the card’s built-in controller (doing so would cause it to tri-state the lines, effectively “letting go” of the NAND bus and allowing another device to take control).

I figured out that the A1 test pad (see diagram) was the controller’s reset line (pulsing this line low while the card was running forced my card reader to power-cycle it), and by holding the line low, the controller would release its control of the NAND Flash bus entirely. After this, my microcontroller code was able to read the Flash ID correctly and consistently.

Reading out the card's Flash ID with my own microcontroller code.

Reading out the card’s Flash ID with my own microcontroller code.

JEDEC Parameter Page… or at least what SanDisk made of it!

The JEDEC Parameter Page, if present, contains detailed information on a Flash chip’s characteristics with far greater detail than the NAND Flash ID – and it’s well-standardized so parsing it would be far easier. However, it turns out that SanDisk decided to ignore the standard format, and decided to use their own proprietary Parameter Page format! Normally the page starts with the ASCII string “JEDEC”, but I got a repeating string of “SNDK” (corresponding with their stock symbol) with other data that didn’t correspond to anything like the JEDEC specification! Oh well, it was worth a try.

I collected the data with the same Arduino sketch as shown above, and pulled 1,536 bytes’ worth of data; I wrote a quick program on Ideone to provide a nicely-formatted hex dump of the first 512 bytes of the Parameter Page data:

Offset 00:01:02:03:04:05:06:07:08:09:0A:0B:0C:0D:0E:0F 0123456789ABCDEF
------ --+--+--+--+--+--+--+--+--+--+--+--+--+--+--+-- ----------------
0x0000 53 4E 44 4B 53 4E 44 4B 53 4E 44 4B 53 4E 44 4B SNDKSNDKSNDKSNDK
0x0010 53 4E 44 4B 53 4E 44 4B 53 4E 44 4B 53 4E 44 4B SNDKSNDKSNDKSNDK
0x0020 08 08 00 08 06 20 00 02 01 48 9A B3 00 05 08 41 ..... ...H.....A
0x0030 48 63 6A 08 08 00 08 06 20 00 02 01 48 9A B3 00 Hcj..... ...H...
0x0040 05 08 41 48 63 6A 08 08 00 08 06 20 00 02 01 48 ..AHcj..... ...H
0x0050 9A B3 00 05 08 41 48 63 6A 08 08 00 08 06 20 00 .....AHcj..... .
0x0060 02 01 48 9A B3 00 05 08 41 48 63 6A 08 08 00 08 ..H.....AHcj....
0x0070 06 20 00 02 01 48 9A B3 00 05 08 41 48 63 6A 08 . ...H.....AHcj.
0x0080 08 00 08 06 20 00 02 01 48 9A B3 00 05 08 41 48 .... ...H.....AH
0x0090 63 6A 08 08 00 08 06 20 00 02 01 48 9A B3 00 05 cj..... ...H....
0x00A0 08 41 48 63 6A 08 08 00 08 06 20 00 02 01 48 9A .AHcj..... ...H.
0x00B0 B3 00 05 08 41 48 63 6A 08 08 00 08 06 20 00 02 ....AHcj..... ..
0x00C0 01 48 9A B3 00 05 08 41 48 63 6A 08 08 00 08 06 .H.....AHcj.....
0x00D0 20 00 02 01 48 9A B3 00 05 08 41 48 63 6A 08 08  ...H.....AHcj..
0x00E0 00 08 06 20 00 02 01 48 9A B3 00 05 08 41 48 63 ... ...H.....AHc
0x00F0 6A 08 08 00 08 06 20 00 02 01 48 9A B3 00 05 08 j..... ...H.....
0x0100 41 48 63 6A 08 08 00 08 06 20 00 02 01 48 9A B3 AHcj..... ...H..
0x0110 00 05 08 41 48 63 6A 08 08 00 08 06 20 00 02 01 ...AHcj..... ...
0x0120 48 9A B3 00 05 08 41 48 63 6A 08 08 00 08 06 20 H.....AHcj..... 
0x0130 00 02 01 48 9A B3 00 05 08 41 48 63 6A 08 08 00 ...H.....AHcj...
0x0140 08 06 20 00 02 01 48 9A B3 00 05 08 41 48 63 6A .. ...H.....AHcj
0x0150 08 08 00 08 06 20 00 02 01 48 9A B3 00 05 08 41 ..... ...H.....A
0x0160 48 63 6A 08 08 00 08 06 20 00 02 01 48 9A B3 00 Hcj..... ...H...
0x0170 05 08 41 48 63 6A 08 08 00 08 06 20 00 02 01 48 ..AHcj..... ...H
0x0180 9A B3 00 05 08 41 48 63 6A 08 08 00 08 06 20 00 .....AHcj..... .
0x0190 02 01 48 9A B3 00 05 08 41 48 63 6A 08 08 00 08 ..H.....AHcj....
0x01A0 06 20 00 02 01 48 9A B3 00 05 08 41 48 63 6A 08 . ...H.....AHcj.
0x01B0 08 00 08 06 20 00 02 01 48 9A B3 00 05 08 41 48 .... ...H.....AH
0x01C0 63 6A 08 08 00 08 06 20 00 02 01 48 9A B3 00 05 cj..... ...H....
0x01D0 08 41 48 63 6A 08 08 00 08 06 20 00 02 01 48 9A .AHcj..... ...H.
0x01E0 B3 00 05 08 41 48 63 6A 08 08 00 08 06 20 00 02 ....AHcj..... ..
0x01F0 01 48 9A B3 00 05 08 41 48 63 6A 08 08 00 08 06 .H.....AHcj.....

Further analysis with my DSLogic showed that the controller itself requests a total of 4,128 bytes (4 kiB + 32 bytes) of Parameter Page data, which is filled with the same repeating data as shown above.

Reset Quirks

When looking at the logic analyzer data, I noticed that the controller sends the READ ID command twice, but the first time it does so without resetting the Flash (which should normally be done as soon as the chip is powered up!). The data that the Flash returned was… strange to say the least.

Byte Value (Hex) Interpreted Value
98
  • Manufacturer: Toshiba
00
  • I/O voltage: Unknown (no data)
  • Device capacity: Unknown (no data)
90
  • NAND type: SLC (Single-Level Cell / 1 bit per cell)
  • Flash dies per /CE: 1
93
  • Block size: 4 MB excluding spare area
  • Page size: 16,384 bytes / 16 kiB excluding spare area
76
  • Planes per /CE: 2
72
  • Interface type: Asynchronous
  • Process geometry: 70 nm planar

This confused me initially when I was trying to find the ID from the logic capture alone; after talking to a contact who has experience in NAND Flash data recovery, they said this is expected for SanDisk devices, which make liberal use of vendor-proprietary commands and data structures. If the fourth byte is to be believed, it says the block size is 4 megabytes, which I think is plausible for a modern Flash device. The rest of the information doesn’t really make any sense to me apart from the first byte indicating the chip is made by Toshiba.

Conclusion

I shouldn’t have to go this far in hardware reverse-engineering to just ask a simple question of what Flash SanDisk used in their high-endurance card. You’d think they would be proud to say they use 3D NAND for higher endurance and reliability, but I guess not!

Downloads

For those that are interested, I’ve included the logic captures of the card’s activity shortly after power-up. I’ve also included the (very crude) Arduino sketch that I used to read the NAND ID and Parameter Page data manually:

Recovering the SIM card PIN from the ZTE WF721 cellular home phone

As seen on Hackaday!

TL;DR – If you have a ZTE WF721 that’s PIN-locked your SIM card, try 2376.

Recently I picked up a used Samsung Galaxy Core LTE smartphone from a relative after they upgraded to an iPhone. As the Core LTE is a low-end smartphone, I suspected that the phone was SIM locked to its original carrier (Virgin Mobile), but in order to test this I needed a different SIM card. My personal phone was on the same carrier, so I wasn’t able to use that to test it.

However, the previous summer I picked up a ZTE WF721 cellular home phone base station (that is, it’s a voice-only cell phone that a landline phone plugs into), which came with a Telus SIM card. The issue is that the WF721 sets a SIM card PIN to essentially “lock” the card to the base station, and it wasn’t the default 1234 PIN; brute-forcing a SIM card is not possible as you get 3-5 attempts before the card needs to be unblocked using a PUK (PIN Unblock Key), failing that, the card is permanently rendered unusable. I decided to take the base station apart, and use my knowledge in electronics and previous research into smart cards to see if I could recover the PIN.

(Yes, I went through all this work instead of just buying a prepaid SIM card from the dollar store. I’m weird like that.)

Test Pads & Signals

After a bit of disassembly work involving removing screws hidden under rubber non-slip feet and a lot of spudgering open plastic clips, I got access to the four test pads that connect to the SIM card, accessible on the opposite side of the PCB from the SIM card socket.

ZTE WF721 Opened

The ZTE WF721 opened, with test pads broken out and connected to DSLogic for reverse engineering.

An ISO 7816-compliant smart card (and a SIM card is one) require 5 different lines to work: Vcc (power), ground, clock, I/O (data), and reset. The I/O is an asynchronous half-duplex UART-type interface, whose baud rate is determined by the card’s characteristics and the clock frequency that it is given by the reader (in this case, the WF721). The details of how the interface work can be obtained for free in their TS 102 221 specification from the ETSI (European Telecommunications Standards Institute).

ZTE WF721 SIM Card Test Pads

The test pads that connect to the WF721’s SIM card socket.

I then soldered the ground wire to a free test pad elsewhere on the board, whereas the four other wires were soldered to the test pads near the SIM card socket. I then connected these wires to a pin header and plugged it into my DSLogic Plus logic analyzer. I analyzed the logic captures after turning the WF721 on and allowing it to initialize the SIM card and attempt to connect to the cellular network (the service to it has been disconnected so it doesn’t actually succeed).

Command Analysis

After looking at the raw logic capture, there was a lot that I had to sift through. I needed to create a custom setting for the UART decoder as the serial output isn’t your traditional “9600-8-N-1” setting. Rather, the interface uses 8 data bits, even parity, and 2 stop bits. The baud rate is determined by a parameter in the card’s initial identification, the ATR (Answer to Reset). I parsed the card’s ATR that I previously captured on the PC using the SpringCard PC/SC Diagnostic tool using Ludovic Rousseau’s online tool, I determined I needed to use a baud rate of 250 kbit/s, since the card was being fed a 4 MHz clock.

T=0 Smart Card Command (APDU) Structure

The smart card communicates to and from the host through APDUs (Application Protocol Data Units). The command header for a T=0 smart card (character-based I/O, which most cards use) is made up of 5 bytes: class, instruction, 2 parameter bytes and a length/3rd parameter byte. To acknowledge the command, the card sends the instruction byte back to the reader and the data is transferred to/from the card, depending on the command used. The card then sends two status bytes that indicate whether the command is successful; if it is, the response is 0x9000. A graphical representation of this process can be seen in the next section.

VERIFY PIN Command Decoding

The raw data structure of a SIM card's VERIFY PIN command. Each part of the flow is labeled for ease of understanding.

The raw data structure of a SIM card’s VERIFY PIN command. Each part of the flow is labeled for ease of understanding (click image to see full size).

The command I’m looking for is 0x20 (VERIFY PIN), and I had to sift through the command flow in the logic analyzer until I found it. After a lot of preceding commands, I found the command I was looking for, and I found the PIN… and it’s in plaintext! As it turns out, it is sent as an ASCII string, but it’s not null terminated like a regular string. Instead, the data is always 8 bytes (allowing up to an 8-digit PIN), but a PIN shorter than 8 digits will have the end bytes padded with 0xFF (all binary ones). It was easy to determine that the bytes 0x32 33 37 36 is the ASCII representation of the PIN 2376, and after the card waited many tens of milliseconds, it acknowledged the PIN was correct as it gave the expected 0x9000 response code.

PIN Testing & Unlocking

SIM Opened in Dekart SIM Manager

Dekart SIM Manager showing the phone number programmed into the SIM card (censored for privacy).

I tried the PIN in the Dekart SIM Manager software on my computer, and it worked! I was able to read out the contents of the SIM and find out what phone number it used to have, although no other useful information was found.

By using the legacy GSM class 0xA0, I was able to manually verify the PIN by directly communicating with the SIM card using the same command syntax in PC/SC Diagnostic:

SIM Card VERIFY PIN Test

Testing the VERIFY PIN command directly in SpringCard PC/SC Diagnostic.

I took the SIM card out and put it in my Galaxy Core LTE phone, entered the PIN, and as expected it brought up the network unlock prompt. I was able to contact my carrier to get the phone unlocked, and they did it for free (as legally required in Canada) – it turned out to be helpful I was on the same network, as they needed an account to authenticate the request against, even if the phone is registered to another account holder. After entering the 8-digit unlock PIN they provided, the phone was successfully unlocked!

The WF721 is in all likelihood also network locked, but that’s a bridge I haven’t crossed yet.

Conclusion

After a bit of sleuthing into how SIM cards communicate with a cell phone, I was able to decipher the exact command used to authenticate a SIM card PIN inside a disused cellular home phone, all to check if a hand-me-down smartphone was network-locked to its original carrier. Was it a lot of effort just to do that? Maybe, but where’s the fun in buying a prepaid SIM card? 🙂

Quick Hack: Converting a computer fan from thermostatic to PWM control

As seen on Hackaday!

About a week ago I needed to replace the CPU fan in my home server as it was running slower than it used to. The Cooler Master Vortex Plus that I chose for my home server uses a standard 92mm fan, and uses the 4-pin connector standard to provide tachometer (speed) readout and PWM speed control.

The Vortex Plus fan’s sleeve bearing was proving to be the weak point of the cooler, and after many, many years of continuous operation, the bearings had lost lubrication and worn themselves down. I had another 92mm fan in my scrap bin, the Nidec TA350DC, but this would prove to be a challenge to adapt it for use in a normal computer system. This fan came from an old Dell Optiplex desktop and used a proprietary 3-pin connector (therefore there was no PWM control), and it was thermostatically-controlled. The fan used a 10kOhm NTC thermistor to measure the airflow temperature, and would increase its speed as the temperature increased (and therefore the thermistor’s resistance decreased). This would prove to be a challenge with implementing that fan as a CPU cooler, as the motherboard uses a PWM (pulse-width modulation) signal to control the fan speed. My proposed solution was to take advantage of the low-current thermostatic control circuitry and effectively override the fan’s own autonomous control system, as opposed to forcing the fan to run at full speed and using high-current MOSFETs to PWM the fan’s power supply, as I felt that doing so could disrupt the fan’s tachometer signal to the motherboard.

PWM Mod Circuit

I used the existing thermostatic control circuit to my advantage, since the thermistor forms the low side of a voltage divider. All I needed to do was use an N-channel MOSFET (specifically, the 2N7002) to short the thermistor pins when the FET’s gate terminal is pulled high, and I swapped the thermistor with a plain 10 kOhm resistor to effectively disable the fan’s autonomous control. I presumed the tachometer signal should be compatible with existing motherboards, and therefore not require any modifications.

As per the PWM fan control specifications, the speed control signal is a 5-volt digital signal, with a frequency of approximately 25 kHz and a variable duty cycle of 30-100%, and is a non-inverting signal. This is especially convenient as this means I don’t need to invert the logic signal before feeding it to the N-channel MOSFET controlling the thermistor input circuit. I did need to protect the gate from ESD (electrostatic discharge) damage, as the gate can only handle 20-30 volts before the gate’s microscopically thin insulation breaks down, rendering it useless. I used a BZX84 5.1-volt Zener diode to act as ESD and overvoltage protection. In the end, my assembled circuit board was actually slightly shorter than the thermistor it replaced!

Conclusion

After all this, I had a fan that would accept a PWM control signal and had at least some control over its fan speed. However, I later realized that the tachometer signal was not working, causing my motherboard to report that the fan had failed. At this point I didn’t really want to come up with another circuit (perhaps a Hall effect sensor) to sense the fan’s speed, so I simply took the easy way out and just disabled the warning in Intel Desktop Utilities 🙂 . I might revisit this mod sometime in the future if I need to do this again.

 

 

Hacking into Windows CE (and Doom) on the Magellan RoadMate 1412 GPS receiver

As seen on Hackaday!

Oh no, I’ve done it again haven’t I?

TL;DR: Just because a device runs Windows CE doesn’t mean it’ll just run Doom without much work.

About a week ago I was perusing some local garage sales, and stumbled upon an old GPS receiver – the Magellan RoadMate 1412. Magellan units are known to run Windows CE, so naturally I had to purchase it and do what needed to be done: run Doom on it. After shelling out $10 CAD for the receiver and its slightly-damaged car charging cord (no mounting bracket, though), it was time to take it home, clean it up, and send it to its Doom… (heh, get it? … I’ll see myself out.)

The Magellan RoadMate 1412 running Doom!

The Magellan RoadMate 1412 running Doom! It wasn’t nearly as easy as I first thought…

Considering that the device runs Windows CE 5.0, I figured that running Doom on it would be a piece of cake once I gain access to the underlying operating system. Should be pretty straightforward… right?

Wrong! Feel free to join me in my adventures in running Doom on a (perhaps excessively) neutered stripped-down Windows CE device.

Stage 1: Power-Up

Turning on the unit wasn’t particularly exciting, although I expected the unit to not work until the internal battery received at least some charge. However, within seconds of me connecting the car charger to it, it powered on with the Magellan logo and a progress bar, before transitioning to the boot logo and the usual “don’t crash your car” warnings that GPS units tend to display on power-up.

 

 

 

Unsurprisingly, the unit still works as a GPS unit, and there’s no immediately apparent way to exit the navigation app and slip out into Windows CE itself. However, after some searching, I found out that plugging in the unit causes it to appear on a computer as a USB drive, with the navigation app’s files all visible as a FAT32-formatted 2-gigabyte volume.

 

 

 

Stage 2: Not-so-Total Command

The navigation app’s files are stored in the “APP” directory, with the executables “Navigator.exe” and “mgnShell.exe” catching my eye in particular. I renamed the Navigator app to “MagNavigator.exe” as to allow a relatively easy revert in case things go wrong, and I copied TotalCommander/CE to replace the original navigation app. Rebooting the receiver didn’t appear to change anything until I dismissed the initial legal warning. Once I closed the notice, I was sent right into TotalCommander/CE, but with no taskbar in sight.

 

 

 

One strange thing I noticed was a distinct lack of icons in the folder list. Scrolling through the “\Windows\” directory revealed some rather disappointing revelations: we have no command prompt, on-screen keyboard, or Explorer shell (therefore, no desktop, Start menu or taskbar).

This meant that running Doom wouldn’t be nearly as easy as it was when I got it running on my Keysight DSOX1102G oscilloscope, as that device still had a (mostly) full Explorer shell and command prompt. Windows .lnk shortcuts failed to work either, causing TotalCommander to just redirect to the system root directory – and forget trying to use batch files! The Explorer components are so absent, third-party apps can’t even display Open or Save dialogs…

This presented a significant roadblock: Chocolate Doom for Windows CE requires command-line arguments. How am I supposed to run this with no Explorer shell or command prompt?!

Stage 3: Getting (Mort)Scripty

Without access to native Windows CE tools, I needed to find a third-party solution to run programs with command-line arguments – maybe some sort of scripting engine.

Enter MortScript. It’s a lightweight yet powerful scripting engine, with a Visual Basic-like syntax. It’s instrumental in making GPS mods like MioPocket possible. (On that note, before this point I wasn’t even aware that MioPocket even existed; I learned that it provides a lot of Windows CE functionality that’s otherwise absent on GPS units like mine, but I’ve come this far – I was determined to forge my own path to Doom.)

 

 

 

I tinkered with the example command syntax and used its Run() command to send the correct command-line arguments for Chocolate Doom: “chocolate-doom.exe -iwad [wad path]“. After running MortScript for the first time to register the .mscr file extension, I was ready to make my script:

# LaunchDoom1.mscr
#########################################
# Chocolate Doom for Windows CE Startup Script by Jason Gin
# Visit: https://ripitapart.com
# Version 1.0 (initial release)
#########################################

# Resource Paths (WAD and executable)
DoomWadPath = "\SDMMC\Fun\chocolate-doom-1.3.0-wince\Doom1.wad"
DoomExePath = "\SDMMC\Fun\chocolate-doom-1.3.0-wince\chocolate-doom.exe"

#########################################

# Step 1: Create required command-line arguments to start Doom:
DoomExeArgs = "-iwad " & DoomWadPath

# Step 2: Run Doom!
# Error checking is performed on whether the WAD file path is valid.
# This assumes that the executable path is valid.
# If it isn't, then nothing happens anyway.

If (FileExists(DoomWadPath))
Run(DoomExePath,DoomExeArgs)
Else
Message ("Error: Unable to find the WAD file at " & DoomWadPath & "!")
EndIf

Stage 4: Doom’d!

After running the script, I declared success: it runs Doom! The window was limited to its lowest supported resolution of 256×200, but running “chocolate-setup.exe” I was able to set it to 320×240 – not great, not terrible. Oh, and it also runs Duke Nukem 3D.

 

 

 

Unlike my previous Doom hack, I wanted to make the window fill the screen but not be in full-screen mode (without any physical buttons, I would otherwise be unable to regain control of the operating system). After some tinkering with the configuration files stored in “\My Documents\.chocolate-doom\chocolate-doom.cfg“, I changed the following settings:

autoadjust_video_settings     0
fullscreen                    0
aspect_ratio_correct          1
screen_width                  480
screen_height                 247

With this adjustment saved, I was able to get a more immersive Doom experience on my GPS receiver.

The Magellan RoadMate 1412 running Doom.

Magellan RoadMate 1412 Runs Doom

Extras

I wanted to see if I could get an external keyboard working on the receiver so I can actually play Doom instead of just letting the on-screen demo do its thing, so I rigged up a mini-USB On-The-Go (USB OTG) cord with a homemade USB B-to-A adapter so I can plug in USB peripherals to the charging port. Unfortunately, it seems that the receiver only supports USB drives, and refuses to work with USB input devices (like mice and keyboards); it doesn’t even work with USB hubs! Oh well, it was worth a try.

 

 

 

Conclusion

As is with many things in life, projects are often much more complicated than it might seem. Just because a certain device runs Windows CE doesn’t mean that all the of the expected components are actually available (after all, it is meant to be a highly modular embedded operating system). However, with the right tools, it can be done – and it keeps the adage of “It Runs Doom!” alive.

Resurrecting a dead MacBook Pro (mid-2012 13-inch, model A1278)

As seen on Hackaday!

A couple weeks ago, I picked up a dead MacBook Pro that was on its way to the recycle bin, and was curious as to whether I would be able to fix it. It had a note attached to it citing several issues with the computer: the display doesn’t work, the battery doesn’t charge, one of the USB ports doesn’t work, and it won’t load an operating system. It certainly didn’t look particularly promising, but I felt it would be a good way to test my skills in component-level repair – with a pretty nice prize if I succeeded.

Triage

The computer I picked up is a mid-2012 MacBook Pro by Apple; it is the A1278 model with a logic board number of 820-3115-B, and it comes with an i7-3520M CPU and 8 GB of DDR3 RAM – however, the hard drive was taken out of the computer by the time I received it. As previously noted, the computer had a laundry list of issues that were certainly the reason the original owner decided to discard their computer – a laptop that doesn’t boot nor have a display isn’t a particularly useful one.

Connecting a MagSafe AC adapter to the computer revealed even more issues: even though the unit was already noted that it wouldn’t charge, I noticed there was no LED indicator on the power adapter’s plug, and the computer wouldn’t power on, even with external power connected; the only sign of life was one of the LED level indicators rapidly flashing when I pressed the button. With this functionality test being unsuccessful, I decided to open up the computer to see what else was wrong…

Troubleshooting & Diagnosis

Unscrewing the bottom cover revealed what horrors the computer had experienced. There was clear evidence that it had suffered from liquid damage: rampant corrosion around the LCD connector and some of the power circuitry, and some of the corrosion deposits were even left on the computer’s bottom cover! If you watch Louis Rossmann’s videos, you would know that liquid damage rarely is an easy fix, especially when high-voltage LED backlight circuitry gets involved.

Liberal use of a 70% isopropyl alcohol solution and a brush was able to scrub away all the corrosion on the computer’s logic board, and the results were not pretty:

Many PCB test pads were either corroded or entirely gone, the backlight fuse (and its pads) were nowhere to be found, and some ICs were missing entire pins! Whatever was spilled on this area of the MacBook certainly had some corrosive properties to it, and it looks like nothing was done to stop the initial damage. With a schematic and board-view software in hand, it was time to investigate what particular components had suffered damage.

Power Supply

Before any device can perform any useful functions, it needs power. I reconnected the AC adapter and started to check the voltages around the DC input jack and its surrounding support circuitry. Since I was able to press the MacBook’s battery indicator and get some response when connected to power, I knew that the main input fuse was intact, and that the SMC (System Management Controller) chip was receiving power via the PP3V42_G3H rail and functional; the G3H (G3-Hot) designation means that the power rail is always on, even if the computer is otherwise turned off. I checked the voltage at the DC jack’s ADAPTER_SENSE line, which is normally at approximately 3.3 volts and uses the 1-Wire protocol to communicate with the power adapter and control the LED on the power adapter’s MagSafe plug. To my surprise, it was at a staggering 16 volts, which meant that something was shorting the DC input voltage (about 16.5 volts) onto a low-voltage communication line – no wonder there wasn’t any LED indicator when I plugged it in! A multimeter measurement found about 2 kOhms of resistance from the power line to the communication line. Thankfully the MacBook’s logic board features a MAX9940 1-Wire overvoltage protection chip, which is rated to protect against voltages as high as 30 volts. I scavenged another DC input connector from an older, dead MacBook which shared the same connector and pinout. After connecting this to the logic board, I found that I got a green LED upon connecting the AC adapter, and the CPU fan started spinning; this is a very good sign as this means the main power circuitry is intact. Measuring the CPU’s Vcore voltage revealed a voltage of about 0.8 volts, which is normal for a modern laptop CPU. With the “heart” of the computer checked out, it was time to focus on the area most affected by the water damage.

LCD & Backlight

Examining the backlight connector and its surrounding circuitry revealed significant damage to many components and the PCB itself. The power supply pins on the LCD connector showed a significant amount of corrosion, and I was concerned that the backlight’s output voltage (up to 52 volts!) could have made its way through all the corrosion residue and damaged critical data lines between the display and the graphics controller. I noticed the backlight fuse (F9700, a 3-amp 0603-size fuse) had gone more than just open-circuit – I couldn’t even find the fuse or its corresponding PCB pads initially! I then probed the LCD connector and found that the display’s 3.3-volt power lines were open-circuit; the corrosion had eaten through the traces between the connector and its decoupling capacitors nearby. Using a diode-mode measurement on the FPD-Link (often called LVDS) lines revealed that the connections were intact; there weren’t any anomalous readings or short-circuits on those lines, the DDC (Display Data Channel) lines, and the 3.3-volt power lines.

Due to the high voltages used to drive LED backlights, I had my suspicions on U9701 (a Texas Instruments LP8550 LED backlight driver). It’s a tiny ball-grid array (BGA) package, and attempts to clean the chip from its edges didn’t seem to do much. Its corrosion looked limited at first – only the feedback line’s probe point was lost – but I was sure the chip was on its last legs (or is it balls?).

Power Management

The LCD connector is in close proximity to the computer’s DC input and its “PPBUS_G3Hot” power rail, which is always on (even if the computer is otherwise turned off), which exacerbates any corrosion due to liquid damage due to its high voltage. Further examination revealed significant corrosion on the outside of the CPU’s high-side current sense resistor (R5400), and the current-measurement pins (pins 4 and 5) on U5400 (a Texas Instruments INA213 current-sense amplifier) were completely gone! Clearly there was no way to salvage that component.

There was significant damage to the SMC’s DC input voltage sense circuitry (“VD0R”), with pins 3 and 4 of Q5490 (an ON Semiconductor NTUD3169CZ complementary pair of N-channel and P-channel MOSFETs) being completely eroded away, much like U5400’s current-sense pins; this part of the circuit uses a P-channel MOSFET to switch on a resistive voltage divider, allowing the SMC to measure what the voltage is on its MagSafe input connector. Also, many of the probe points related to that circuit were also completely eroded, revealing dark pits instead of silver-plated copper pads.

FireWire

The FireWire circuit wasn’t spared from the carnage, either. Pins 3 and 4 on Q4262 (a Diodes Incorporated BSS8042DW complementary MOSFET pair) were also severely damaged; these pins are used to quickly disable the FireWire power output transistor (Q4260, an ON Semiconductor FDC638P P-channel MOSFET) in case of a “Late-VG event“. This occurs if the ground pins of the FireWire connector are mated too late when plugging in a device – this creates a dangerous overvoltage condition on the FireWire data lines, as up to 30 volts briefly find a return path through the data lines, risking damage to the device and host controller. I wasn’t as concerned with this circuit, as I don’t have any FireWire peripherals, and the circuit in its current state simply means the FireWire port will be unable to disconnect power if a bad cord is plugged in.

Thunderbolt

The area that had the least liquid damage was C3897, which belonged to U3890 (a Linear Technologies LT3957, a 15-volt boost converter for the MacBook’s T29 chip and Thunderbolt interface). All this area needed was a bit of corrosion cleanup.

USB Port

During the functionality tests, I noticed the metal casings of the USB port were getting very hot to the touch, and I nearly burned myself on U4600 (a Texas Instruments TPS2561, a dual-channel load switch with internal current limiting)! I found a short-to-ground problem on a power line on one of the USB ports, which explains the symptom listed on the note. I desoldered the chip, initially thinking the issue was in the chip itself, but the fault remained. I narrowed the problem down to C4695, a 10-microfarad ceramic capacitor that had short-circuited internally; this caused the TPS2561 to go into current-limiting mode, which turns the chip into a resistor and dissipate copious amounts of heat into the PCB, which made its way to the USB ports (and then my fingers – ouch).

Hard Drive Cable

During the repair process, I was able to install Mac OS X Lion to a SATA SSD, but soon found the MacBook unable to recognize SSDs, despite hard disk drives showing up just fine! As it turns out, the A1278 is notorious for bad HDD cables, with even replacements failing within months of installation. This appeared to be caused by chronic frictional damage, as the cable is sandwiched between the hard drive and the MacBook’s rough aluminum casing – even regular use of the laptop was found to create hairline cracks in the cable. Thankfully replacements are relatively inexpensive, and a little bit of Kapton tape as a barrier against the casing was the “vaccine” against future cable failures.

Repairs

With all of the problems written down, it was time to start fixing up the MacBook. Time to break out the hot air rework station, soldering iron, solder, magnet wire, and plenty of flux!

DC Input Jack

I desoldered the DC input jack, and found there was a lot of corrosion residue bridging the +16.5-volt power line to the ADAPTER_SENSE 1-Wire communication line.

With some isopropyl alcohol and some scrubbing with a small brush, I was able to clean up the corrosion and resoldered the jack into place. A quick multimeter test found that there was no more 2 kOhms of resistance from the power to the data line, and I was able to get an LED indication when I plugged in the AC adapter, including an orange light that indicates the battery is charging.

LCD Connector

I wanted to determine if the display was still functional, so I first focused my attention to the LCD connector, even if I had to eschew the LED backlight for a bit.

I ran a jumper wire from L9004 to pins 2 and 3 of the LCD connector; this belongs to PP3V3_LCDVDD_SW_F, which provides the 3.3-volt power to run the LCD panel except the backlight. After cleaning out the flux and corrosion on the logic board’s connector as well as the LCD cable, I was able to get an image on the display!

USB Port

With the faulty component identified, I replaced C4695 with an identically-rated 10-microfarad 6.3-volt X5R ceramic capacitor in an 0603-sized package. After replacing the capacitor, the USB port was fully functional again!

Current-Sense Amplifier

After ordering both the INA213 and LP8550 from Texas Instruments, it was only a few days before they arrived in the mail. I desoldered the dead chip from the logic board, cleaned up the pads with some flux and desoldering braid, and installed the new chip. Running Apple Service Diagnostic tools showed that the current-sensing circuit was working correctly.

DC Input Voltage Divider Switch

I didn’t want to buy another transistor pair for Q5490, so I replaced the P-channel half with an ON Semiconductor NTK3142P P-channel MOSFET that I salvaged from an older donor MacBook logic board. I scraped away some solder mask on one of the broken traces heading to the SMC’s voltage divider so I could solder the transistor’s drain terminal to it, and used magnet wire to connect the transistor’s gate and source to their corresponding locations across R5491. R5494, leading to PM_SUS_EN, was found to have a 0-ohm resistor that was open-circuit; this was easily bypassed with a wire jumper across the resistor’s original pads. After cleaning off the flux and performing continuity measurements, I measured the voltage at the SMC’s voltage divider resistors and got a valid voltage reading when I plugged in the AC adapter.

LED Backlight Driver

The LP8550 was up next for repair. I took a 2-amp 0603-sized fuse from a dead hard drive, and used some magnet wire to reattach it to the remnants of F9700, which was a 3-amp fuse originally; note that it’s far safer to use a fuse of a smaller rating instead of a larger one, should a circuit fault still exist.

Tracing the other lines to the LP8550 revealed that R9731 (leading to PPBUS_SW_LCDBKLT_PWR) was open-circuit at a via, which was easily bridged with some solder and magnet wire. R9010 (leading to PPBUS_SW_BKL) was open as well.

After reinstalling the fuse, I actually got the backlight working! However, upon a power cycle I heard a snap, saw a puff of smoke, and lost the original backlight chip. Chances are there was indeed some corrosion residue had caused 50-odd volts to end up on a more sensitive pin on the LP8550. I used an Xacto knife to lightly scratch an outline around the chip, then used copious amounts of flux and desoldered the dead chip with my hot air rework station; I also removed the fuse to help in further troubleshooting to ensure that there weren’t any short-circuits to ground on the backlight circuit. I cleaned up the area with leaded solder and some solder wick, and cleaned up the residual flux in anticipation of the new chip’s installation.

The chip was remarkably easy to install – just get the A1 ball lined up according to the board view, and heat the board to the right temperature. After thoroughly cleaning away the flux from the area, I turned on the MacBook… and let there be (back)light! I power-cycled the computer and the LED backlight remained functional! (And for the record, the fuse didn’t even blow during the entire ordeal.)

FireWire Late-VG Protection Circuit

I considered this issue to be a “WONTFIX, as I had no use for FireWire connectivity (nor do I have the correct FireWire 800 cables anyway). If I want to sell this computer, I might install a P-channel MOSFET to replace Q4262 (see the LCD Connector section above) in a similar fashion to the DC input voltage-sensing circuitry.

UPDATE (February 26, 2020): Actually, several months ago I decided to finish things once and for all, so I replaced the damaged P-channel MOSFET circuitry. Do I have much use for FireWire? Not really, but it at least means I can experiment with it or something later on, if desired.

Testing

It takes a little bit of Google-Fu, but with the help of a BitTorrent client, I downloaded the disk images to create an Apple Service Diagnostic (ASD) drive. This is far more sophisticated than the built-in diagnostic when you boot the computer while holding down the D key. With ASD, one has the option to use a stripped-down version of Mac OS X – in a similar vein as WinPE – or a very lightweight UEFI (Universal Extensible Firmware Interface) environment that looks very much like Mac OS 9 and earlier.

It took over half an hour, but all the tests passed without a problem, since all the sensor readings were valid. My MacBook Pro has been restored to working order! I installed Mac OS X High Sierra to a 1TB SSD, and used Boot Camp to run Windows 10 Pro as the default operating system (what can I say, I like Windows 🙂 ). The Mac Precision Touchpad driver project makes the touchpad a pleasure to use, as the built-in Boot Camp driver provides a much less-comfortable experience.

Conclusion

Much like solving a puzzle, component-level troubleshooting of modern electronics is possible, but this is only feasible if the relevant documentation exists as a good reference point. One can do without them, but the act of reverse engineering isn’t easy if one only has a non-working device.

With the help of a schematic and board view (including the open-source software OpenBoardView), one can easily find what circuits a component belongs to, and where it goes. By following the connections, one can track down the problem(s) with the board, and hopefully save a device from an untimely end in the landfill or a recycling facility.

Right to Repair

This project is an example of why I believe in the right to repair. If I didn’t have (even unofficial) access to schematics, board views, and diagnostic software, I wouldn’t have been able to bring this dead MacBook Pro back to life. However, with a little bit of electronics troubleshooting knowledge and skill, I was delighted that I diverted a discarded dysfunctional device from a demise in the dumpster. In fact, this blog post was written from the MacBook I just repaired!

Atomic Pi Adventures, Episode 1: Adding external PCI Express expansion by removing onboard Ethernet

As seen on Hackaday!

TL;DR: The Atomic Pi single-board computer CAN be expanded through PCIe. It’s just a massive pain to do so, even if you have steady hands. Let’s just say it’s a long story…

DISCLAIMER: The modification performed in this blog post can, and has, caused permanent hardware damage to my Atomic Pi, albeit repairable with much skill and effort. Reenacting what I’ve done requires significant experience with SMT (surface-mount technology) components, some barely larger than a grain of sand (I consider 0402-size components to be “oversize” in this instance). I accept no responsibility for damages arising from attempting this modification.

Introduction

Single-board computers (SBCs) are all the rage nowadays, with the Raspberry Pi being the most well-known in this category. SBCs are compact computers, carrying their own CPU and memory, and usually some on-board storage and various I/O connections (e.g. USB, HDMI, Ethernet). Most of these computers use the ARM architecture, found on almost all mobile devices today. However, some use the x86 architecture, which is used in higher-end tablets, laptops and desktop computers.

Recently, the Atomic Pi made waves in the electronics hobbyist space, boasting an Intel Atom Z8350 quad-core CPU with 2 GB of RAM, 16GB of eMMC storage, Gigabit Ethernet, Wi-Fi, USB 3.0, built-in speaker amplifiers, and lots of general-purpose I/O (GPIO) pins – all for less than $40 USD!

As one might expect, there were caveats to this little computer, with some dismissing it very harshly, if not unfairly. With some investigative work, members of the community found out that the “Atomic Pi” board actually belonged to the Kuri robot from Mayfield Robotics; the company shut down in late 2018, and the liquidated stock of these SBCs were snatched up by Digital Loggers with the help of a Kickstarter campaign, who then developed breakout boards to make using them easier. This is because – unlike almost every other SBC – you can’t just plug in a barrel jack or USB cord to power it! Instead, it uses a 2×13 pin header, which many users did not have on hand, nor had the skill and/or resources to build their own solution. This is compounded by the board’s need for clean, well-regulated 5 volts at 2-4 amps, with 12 volts being optional to power the onboard speaker amplifier. The 16 gigabytes of eMMC storage proved to be too cramped to run Windows 10 directly, and the Realtek RTL8111G Gigabit Ethernet chipset is often frowned upon by those in the pfSense (a free firewall/router OS) community.

The NIC’s usage of the Z8350’s single PCI Express (PCIe) lane is what caught my attention. Unfortunately, the RTL8111G Ethernet chip is soldered to the board, with no easy method to replace it with an external card. A few people have attempted to remove the chip and wire in an external PCIe riser, without success. With my previous experience with fine-pitch electronics work, I figured that this would be a fairly easy modification to make.

It wasn’t. In fact, this was one of the most frustrating electronics projects I’ve done to date – and now you get to come with me in my adventures (and misadventures) in microsoldering on the Atomic Pi (or was it Kuri? The jury seems to be out on the nomenclature).

Optional Reading: PCI Express Signals

PCI Express (or PCIe for short) is a very common high-speed interface for connecting processors or chipsets to all sorts of different peripherals. It uses low-voltage differential signaling to minimize interference, and a single lane of PCIe can carry 250 MB/s for the first generation, all the way up to 2 GB/s for the fourth generation!

A single PCIe lane is made up of three differential pairs: receive, transmit, and a 100 MHz reference clock. Control signals for waking up and resetting peripherals are also provided. Riser cards, often used for cryptocurrency mining, use these five signals to provide the minimum connectivity for any PCIe peripheral. The interface is highly flexible, allowing graphics cards that normally use 16 lanes to run on just one lane. The specifications for PCIe make board design easier, as the differential lines are designed to adapt to different lane configurations; the polarities in a pair (transmit, receive, clock) are polarity independent (all that matters is that you maintain a good differential pair).

PCIe Signal Pinout

Since the Atomic Pi lacks a PCIe slot of any kind, I have provided the diagram indicating which signals go to what points on the board:

Note that the AC coupling capacitors are required for the peripheral Rx (receive) pair, and are optional for the peripheral Tx (transmit) pair.

Attempt 1: “Chipple” Bypass

I started this project with the intent of making my modifications as reversible as possible. Looking at the RTL8111E’s datasheet (a similar model to the 8111G) revealed the presence of an “ISOLATEB” pin; grounding this pin causes the chip to disable itself and release its hold on the PCI Express lines.

This brings us to the first roadblock: the Ethernet chip is soldered to the board, and the components that connect it are known as 0201 (0.002 inches in length, 0.001 inches in width), smaller than a breadcrumb!

After disabling the Ethernet chip, I used my trusty 0.1mm (that’s four-thousandths of an inch!) magnet wire to tap into the PCIe lines, and brought them out to a PCIe riser card, which provided a USB 3.0-type socket for an external connection.

Unfortunately, high-speed signals aren’t just about wiring from one device to the next, and this was no exception. The rat’s nest of wires were no suitable medium for the 5-gigabit signals to traverse (PCIe requires very tight control of electrical trace layouts), and the Atomic Pi was unable to detect the presence of any device on the external PCIe port.

Attempt 2: Thin Twinax Troubles

With the signaling issue in mind, I tried some very thin twinaxial cables from a dead MacBook’s hard drive cable to connect the PCIe lines to my riser card. This cable is very thin, with an outer thickness similar to fishing line. It uses a foil and wire “shield” around the two internal wires to protect it from external interference. I figured that this should help protect the delicate signals from the harsh outside world.

I didn’t have very much of this thin twinax cable on hand (that said, if anyone knows where I can buy this stuff in bulk, please let me know!), so I was limited to very short lengths for each pair. I ran out of twinax after the transmit and receive pairs, and resorted to using two coaxial cables from that same cable bundle.

After fiddling with the super-thin wires and soldering them to the header, I still got nowhere and could not get the Atomic Pi to see an external PCIe card.

Removing my modifications revealed my first blunder: despite placing the chip into a temporarily disabled state, the Ethernet chip was dead! This meant that I could not use the built-in Ethernet port anymore, and the LEDs next to the Ethernet port simply glowed a dull orange instead of their rapid blinking pattern when data is being transmitted over Ethernet. I figured that there was no use keeping the chip on the board, so I used my hot air rework station to remove it, using a generous amount of heat and flux to get the chip removed.

Attempt 3: To Ribbons, You Say?

With the dead Ethernet chip removed, I decided to use another tactic to bring out the PCIe interface. I decided to use a thin ribbon cable (okay, more accurately a “flat flexible cable”) to help decouple any mechanical stresses from moving the external USB 3.0 connector around during testing, and its compact size allowed me to try using the QFN pads to connect my ribbon cable, hopefully minimizing noise that could be picked up, as well as avoiding any “stubs” of wiring within the differential pair that would degrade the signals.

The challenge is that the chip uses a very small pad spacing, and I wanted to avoid soldering directly to the ribbon cable. I managed to salvage a couple connectors from some other devices, and used a cut-up CompactFlash memory card’s PCB to make it easier to solder the connector, as well as provide a base for the connector to sit on.

The connector on the board was affixed onto the Ethernet chip’s footprint with the help of some double-sided foam tape, and magnet wire was used to bond the PCIe signal wires onto the ribbon cable. The ribbon itself used copper foil on one side to help with interference suppression, and provide the signals a “ground plane” to travel across.

Things weren’t quite as elegant on the other end of the cable, however. I still had to contend with a fine-pitch ribbon connector, and a way to connect that to a male USB 3.0 plug to hook it up to the PCIe riser. I had little option except to use the small lengths of twinax from the last attempt, and (perhaps unsurprisingly) it didn’t work.

Attempt 4: Teeny Tiny Twists

The next attempt went to a smaller scale, using a very fine twisted pair I created out of my 40 AWG magnet wire (each wire is as thick as a hair). I then shielded this magnet wire by sandwiching it between pieces of copper tape to act as a ground plane for the signals. I reused a dead USB 3.0 drive to act as the plug for the PCIe riser, and wired it straight to the QFN pads of the original Ethernet chip.

Although the twisted pair would have, in theory, reduced intra-pair skew and some degree of EMI resistance thanks to the copper tape, the homemade twisted pair almost certainly would not have provided a 100-ohm differential impedance that’s required for PCIe signaling. Once again, the attempt to break out PCIe from the Atomic Pi was a failure.

Attempt 5: SASsiness Yields Results…?

Since the central issue with external PCIe connectivity involved the connection between the Atomic Pi board and the riser, I figured I would try a medium that was specifically intended for high-speed differential signals. I bought some thin SATA cables (specifically the 18-inch thin cables on Amazon), which used two pairs of thin SAS cable per SATA connection. The difference between SATA, SAS and PCIe is of little difference here, as the key criteria was a differential pair with 100-ohm impedance and ability to carry high-frequency signals.

Despite being a thin cable (each conductor is only 30 AWG, or 0.25 mm in diameter), these cables were too stiff to directly connect to the AC coupling capacitors without a very high risk of damaging the capacitor and/or the pad it is soldered to. I had to reinforce the cables by soldering them down and hot gluing the cables to the board at multiple points to relieve stress on the connections.

The original 0201-sized AC coupling capacitors were removed, and more reasonably-sized 0402 capacitors were used in their place. Each capacitor was a common 100nF capacitance, and were easy to harvest from some dead laptop motherboards I had on hand. I “flipped” the layout of the capacitors away from the QFN footprint, but this meant that only one side of the capacitor was actually anchored to the board; this made the capacitors very fragile and I often lost the terminations on the capacitors (thankfully they are plentiful in electronic devices like this).

To help minimize stress on the vulnerable capacitors, I used magnet wire as a flexible “bridge” between the capacitor and the SAS cable, and the cables were tied down to solder tab wire that I used as “bus bars” for a strong electrical ground as well as a tie-down point to take away most of the strain of the cable’s flexion. Despite negatively affecting signal integrity, I was able to get a sufficiently stable connection to perform some initial testing, and I succeeded! I was able to connect an Intel 82576-based dual-port network card to my external riser.

However, this didn’t last long. The simple movement of the wires on the board when trying to connect different peripherals was enough to break the AC coupling capacitors, despite the use of my flexible terminations from the SAS cable to the capacitors. Replacing the damaged capacitors and redoing the magnet wire terminations failed to restore connectivity, so I desoldered all of the existing connections and started over.

Attempt 6: We Have Lift-Off! (that’s bad)

In an attempt to further improve signal integrity, I decided to take the bold move of eschewing the flexible terminations of magnet wire, and decided to directly solder the SAS cable’s wires to the capacitors, which are soldered to the PCIe pads on the Atomic Pi. I anticipated that physical stresses would damage the capacitors, so I opted to use longer lengths of SAS cable, and to hot-glue the cable to the board at regular intervals to minimize the amount of stress that gets coupled into the cable and capacitors. The 100 MHz reference clock continued to use magnet wire for connection as it was at a much lower frequency than the PCIe signals, and reduced the amount of physical crowding around the PCIe pads.

This leads me to my next issue. To help with structural integrity, I repositioned one of the SAS cables so that it would remain on the board before going to the PCIe riser connector, meaning it would be perpendicular to the other cables. This required me to strip extra foil shielding to jump over the existing capacitors, which increases the risk of physical stress on the capacitors, as well as reduced signal integrity. Additionally, I used a longer set of twinax cables, sacrificing two SATA cables to get most of the original 18 inches of length per cable.

Testing of this construction method failed, and it was only at this point that I realized that two of the four AC coupling capacitors I used weren’t the same capacitance; PCIe usually uses 100 nF capacitors, but I had a 1 nF on one PCIe pair, and 10 nF on the other – no wonder things weren’t communicating (and that’s what I get for assuming the capacitors were the same)! Unfortunately, the process of removing the SAS cable connections resulted in the phenomenon I was trying to avoid: the board’s PERp0 (the positive side of the PCIe receiver) pad had lifted away from the PCB, leaving me with an unsolderable crater.

Attempt 7: Success!

After losing one of the pads for the original coupling capacitors, I was feeling pretty defeated. I didn’t let this stop me, and I decided to apply my magic skills with 40-gauge magnet wire to the trace, and was able to get a sufficient connection with a replacement 100 nF capacitor (and this time I measured it…), hopefully in such a way as to not too negatively affect signal integrity. Unfortunately, the 0201-sized resistors used to connect the PCIe reset and wake signals lost their terminations and no longer took solder on one end; I opted to keep the resistors in place and soldered magnet wire to the other end (facing the SoC rather than the Ethernet chip), as further measurement revealed that the resistors were simply 0-ohm jumpers anyway. I scrapped the longer SAS cables, and went back to the ~8-inch lengths to reduce attenuation.

After rewiring the PCIe data, clock and control lines, I crossed my fingers and retried the riser with my Intel NIC – and it worked! After verifying the connections were sufficiently strong, I decided to make the modification permanent, and covered the area with epoxy to prevent any of the components from breaking loose. Additionally, I used some aluminum sheet metal, double-sided foam tape and some zip ties to create a reinforcement bar, helping to strengthen the cable connections as they leave the board.

With all the connections in place, it was time to get to the fun part: seeing what PCIe peripherals work on the Atomic Pi!

Testing Results

One interesting behaviour of the Atomic Pi when using Windows is that changing PCIe cards often causes the system to immediately power down or freeze during boot. Simply powering the board on again seems to fix this issue. Why am I using Windows? I already upgraded the board to a 64GB eMMC instead of the original 16GB, and the appeal of the Atomic Pi’s x86 architecture was the ability to run desktop apps – in particular, Windows apps.

The UEFI firmware has a hidden advanced menu, accessible if the “shutdown /r /fw” command is run as administrator. There is a section for PCIe settings, and the ones that interested me were the AER (Advanced Error Reporting) and PCIe hot-plug settings. Unfortunately, these do not seem to have any effect as Windows doesn’t seem to pick up any hardware events in the Event Log.

Test 1: Network Devices

PASS: Intel 82576 Dual-Port Adapter

If the port(s) have a connection to another device (e.g. a network switch), the orange and green LEDs will illuminate soon after the PCIe link is properly initialized, which makes for quicker troubleshooting. Even if the system is off, the green link LED remains lit, but goes out once the PCIe bus is reset.

The card works a treat in pfSense and Windows, allowing high-performance Gigabit-speed transfers with both ports in use. However, the UEFI firmware does not recognize the card as a bootable medium (which is for the best in my opinion – the PXE boot program tends to hinder the boot process more than anything).

PASS: Realtek RTL8111GS (External Card)

The PXE network boot program built into the UEFI firmware works just fine, which is expected as the chip is the same type as the original RTL8111G, with the exception that the chip has a built-in switching converter rather than the linear LDO that the Atomic Pi uses natively.

PASS: JMicron JMC250

The card works in Windows and pfSense; the particular card I have has always been flaky in operation, and this was no exception.

PASS: ASUS PCE-N15 (Realtek RTL8192CE) 802.11n WLAN Adapter

Using this card is a performance downgrade compared to the dual-band adapter already present on the Atomic Pi, but it did function correctly in Windows.

Test 2: External Graphics Cards (eGPU)

PASS: EVGA/nVidia 8500 GT

The UEFI firmware doesn’t seem to want to acknowledge the presence of PCIe graphics adapters, which results in issues when testing in Windows. Initially, the card failed to enumerate correctly, identifying as a Microsoft Basic Display Adapter and displaying an error in Device Manager:

This device is not working properly because Windows cannot load the drivers required for this device. (Code 31)

The driver trying to start is not the same as the driver for the POSTed display adapter.

Manually downloading and installing the drivers allowed the card to work properly. As previously mentioned, the UEFI does not recognize the presence of the graphics card, so a monitor that is plugged into the graphics card won’t start working until Windows is loaded and the graphics driver initializes.

Additionally, the adapter’s resources won’t be used if the monitor is connected to the Atomic Pi’s built-in HDMI port – it’s a tradeoff between graphics performance and the ability to see what’s happening on boot; maybe this is different in a multi-monitor configuration, but I didn’t have enough desk space to test this.

PASS: XFX/AMD Radeon 4650

The same driver issues popped up when using this card as well, and the same fix applies.

PASS: ASUS/nVidia GTX 660 Ti

Ditto. This card was the best I had on hand for testing, and amusingly it consumes about 10-20 times the amount of power that the Atomic Pi uses, and is bigger in size as well.

Connecting this card allowed me to run games that would otherwise simply not work, such as Just Cause 2. It performed at about 20-30 frames per second: not great but not terrible. Just Cause 3 was absolutely painful to run, but this is not surprising as the Atomic Pi’s hardware is far below the game’s minimum requirements in almost all aspects.

Test 3: Storage

PASS: Dell PERC H200 SAS Adapter

To my surprise, not only will it work in Windows, it even supports booting from the UEFI! It mentions a SAS controller driver during boot, but I am not sure as to whether this is present in the Atomic Pi’s UEFI, or if it is a driver provided by the SAS adapter itself. The PCIe 2.0 1x interface limits the maximum throughput to less than 500 MB/s, which is slower than a single SATA III connection, and informal testing showed a maximum speed of about 330 MB/s. I have not investigated whether I could configure the hardware RAID features of the adapter from the operating system, but I imagine that doing so would dramatically improve performance when running a RAID 1 (mirrored) setup.

FAIL: OWC (What’s This?) Aura Pro X 480GB NVMe SSD

This SSD was originally meant as an aftermarket SSD upgrade for a MacBook Air, but I bought a PCIe adapter card to use it in a regular PC. I was able to get it to enumerate in Windows, but any sort of read/write action caused it to lock up and throw errors in the Windows Event Log. I suspect this is likely a power supply issue due to how the PCIe riser board provides power (12 volts -> DC/DC converter -> 3.3V LDO linear regulator), or maybe the card just doesn’t want to cooperate.

FAIL: Marvell 88SE9128 2x SATA III + 1x PATA UDMA-133

This adapter caused Windows to almost freeze during boot. Instead, the boot animation would advance by one frame every 2 seconds until the card is removed. I suspect this is the PCIe bus attempting to negotiate a connection but failing, possibly due to signal integrity issues. Checking the Windows Event Log turned up nothing either.

Test 4: … More PCIe?

FAIL: Extender Cable

One would think that a simple straight-through cable would work, right? Nope – it seems like the signal has already been degraded enough after traveling across multiple non-ideal connections, and the extra length in a cable was just enough to degrade the signal beyond recovery.

PASS: PCIe Port Multiplier / ASMedia ASM1184e

UPDATE (July 3, 2019): It seems that the “USB 3.0” pinout on these PCIe risers is inconsistent. I was somehow able to make do by using a PCIe card plugged into my current riser, which then led to the ASM1184e board, which splits the PCIe 2.0 1x lane into four slots. It seems like the chip is able to handle the attenuation over such a long cable length, and it even includes a 100 MHz clock buffer, effectively “refreshing” the signal for more sensitive PCIe cards.

I tried the 88SE9128 card again, but still had no luck. Ethernet cards worked without issue, but I wasn’t able to test the graphics cards as the ASM1184e board’s PCIe slots are closed-ended 1x slots. I could try cutting the edge of the connector to allow for larger cards, but that’s an experiment for another day.

Conclusion

Despite all the roadblocks I encountered in this project (sometimes rage-inducing ones), I still enjoyed this project and the various techniques I tried along the way. The prospect of attaching desktop PCI Express peripherals to a CPU designed for a tablet opens up many different applications for the Atomic Pi, including ones that would otherwise have been less efficient or impossible using the board’s built-in hardware (e.g. dual Gigabit Ethernet without using USB, or GPU-intensive computing workloads).

The high-speed nature of modern computer systems presents many challenges for engineers and modders alike. Multi-gigabit interfaces easily reach into the realm of radio-frequency (RF) electronics, where things like AC transmission line effects and differential signal routing become very important. This issue, coupled with the fact that most modern devices use very small components, makes it difficult for many electronics hobbyists to access these interfaces on their own. Although a modification like what I did is technically possible, I don’t consider it practical for an average electronics/computer enthusiast.

Future Ideas

One avenue I’ve thought about pursuing is a mod board that is soldered onto the Atomic Pi’s PCB, allowing a more robust connection for the PCIe lines; this would include signal integrity improvements like ESD protection and PCIe line redrivers to strengthen the signals to/from the SoC and peripheral(s). PCB design is currently beyond my scope of knowledge, but it would be a fun way to dive right into the subject matter.

eMMC Adventures, Episode 4: Recovering data from physically damaged BGA eMMC Flash storage chips

As seen on Hackaday!

The ball grid array (BGA) chip package has been instrumental in getting modern electronics to fit in smaller and smaller spaces, as it uses tiny balls of solder on the bottom of the package to make electrical connections, instead of copper leads on the edge of the chip package. This allows for hundreds of connections to be made in a small amount of PCB area, but their size also makes them very vulnerable to damage as well.

One common way for BGA chips to become damaged is called “pad cratering“, where the copper pad on the package’s substrate (basically a wafer-thin circuit board) separates and leaves behind a crater.

In the case of eMMC (Embedded MultiMediaCard), its package type is known as an FBGA (Fine-pitch Ball Grid Array), so the area of each pad is very small (0.4 mm in diameter!); it doesn’t take much at all to crater the pad – even gently removing solder with solder wick and generous amounts of flux can still cause damage! Most of the pads on an eMMC package are unused, but if any one of the DAT0, CLK, or CMD pads are damaged, then the chip is rendered unusable, even if the chip is placed into a socket for data recovery. If the DAT1-DAT7 pads are damaged, data recovery becomes much slower as the chip is forced to use fewer lines to transmit data (the MMC standard supports data over 1, 4, or 8 lines).

However, there is some hope. Many FBGA packages, including eMMC, use pads that are SMD (solder-mask defined); this means the solder mask is what defines the size of the pad, not the copper itself. Therefore, when the pad gets cratered, often there is a “halo” of copper left behind that still has a chance of getting an electrical connection.

The trick is how to to get a flat conductive area that a chip socket can use to get a reliable connection (without a copper pad, soldering is no longer an option). The eMMC socket adapter I used breaks out the eMMC onto an MMCplus-shaped PCB that can be inserted into any commercial SD card reader.

Filling in the Blanks

There are a few possible materials that can be used to restore contact area on a damaged BGA pad. One possible option is a silver-filled conductive epoxy, but I have not tested its efficacy; an additional consideration is that the volume of the filled-in crater might not be enough to get a filling with sufficiently low resistance for a good connection.

Another option is using solder paste, which I used in this case. Unfortunately, solder’s surface tension is our enemy when trying to fill in a flat area (it wants to form cohesive balls and therefore won’t want to stick to the ring of copper left around the crater), so a means of forcing the solder into the crater requires something flat, rigid and capable of handling the high temperatures experienced during soldering.

At first I tried Kapton (polyimide) tape, but that was a massive failure since it didn’t have the rigidity to stay flat when the solder paste began to melt, and the liquid flux rendered the adhesive useless.

The solution to the issue came in the form of glass. Specifically, I used very thin (0.15 mm thick!) glass “cover slips” normally used to prepare specimens for viewing under a microscope. These can be very inexpensive and one can obtain hundreds of them for a few dollars. The key is to fill the craters with the solder paste by using a knife as a squeegee, then placing the cover slip on top of the eMMC and reflowing it.

It took a few iterations for the pads on some of my eMMC chips to be restored sufficiently, as the volume taken up by the solder will be less than the paste and its accompanying flux. It doesn’t have to fill the entire crater – it just needs to be enough for the eMMC socket’s pins to make a solid connection.

Conclusion

The high-density nature of modern BGA chips is both a blessing and a curse. When trying to do data recovery from devices that use such tiny chips, such as eMMC or UFS Flash storage, sometimes the desoldering process is too much for the chip’s pads to handle. With some ingenuity (and thin glass), it might be possible to temporarily restore enough conductive pad area to get the data off with the help of an eMMC socket.

Performing safer AC line voltage measurements using isolated amplifiers

DISCLAIMER: AC line (mains) voltage is not something to be taken lightly! Attempting to safely handle line voltages while minimizing the risk of harmful or fatal electric shock is the main motivator for me to design and build this circuit. However, I am no electronics engineer and I definitely have no formal training on international standards pertaining to high-voltage safety. I accept no responsibility, direct or indirect, for any damages that may occur if you attempt to make this circuit yourself, including personal harm or property damage. Additionally, there is no warranty or guarantee, express or implied, on any content pertaining to this blog post (or any other posts).

UPDATE (November 19, 2018): Added isolation voltage ratings for the amplifier and DC-DC converter.

As seen on Hackaday!

Back in mid-2017 I won a Keysight DSOX1102G digital storage oscilloscope (DSO), a piece of equipment long on my wish list but never acquired until then. One thing I’ve wanted to be able to measure with an oscilloscope for a long time was the waveform of the AC utility (in other words, the wall outlet). However, doing so presents a very real risk of blowing equipment up or shocking yourself (and possibly other people). In order to prevent this, I needed a way to perform measurements on the AC line without being directly connected to it; in other words, I need galvanic isolation.

Isolation Methods

There are many different ways to achieve galvanic isolation. Common methods are the use of transformers and optocouplers, but they each have their own disadvantages.

Optocouplers (aka optoisolators) are a common component used for isolation, but they require a fair bit of external circuitry to work correctly – not to mention its current transfer ratio (CTR) varies with temperature and age, resulting in drifting measurements over time if a feedback circuit isn’t used. They also aren’t very fast; the common Sharp PC123 optocoupler has a cutoff frequency of only 80 kHz and a response time of 3-18 µs (but newer ones can be much faster).

Transformers don’t require active circuitry and would make stepping down voltages simple. However, their inductive nature causes issues when measuring waveforms with low-frequency content and sharp edges (like the output from modified sine wave inverters), resulting in inaccurate measurements due to the ringing and other distortion that the transformer creates. Additionally, common iron-core transformers aren’t very good at capturing frequencies above 20 kHz.

Solution: Isolation Amplifiers!

I settled on using an isolation amplifier to provide the necessary protection from the AC line and the oscilloscope. Several years ago TI provided sample kits for electronic motor drives, with one component being the AMC1200 isolation amplifier; this is the IC that I used in my AC waveform viewer – however, note that there are some limitations that I will address later in this blog post.

The AMC1200 uses TI’s digital capacitive isolator technology, using high-voltage SiO2 (silicon dioxide) dielectric capacitors on the chip itself for high voltage protection. The amplifier’s input is essentially digitized using a sigma-delta modulator, whose output is then sent digitally across the isolation barrier before being demodulated back into an analog output. It is rated for a working isolation voltage of 1200 Vpeak (848 Vrms), and a maximum isolation voltage of 4000 Vpeak (2828 Vrms), well above the typical voltages experienced on a 120V line.

AC Waveform Viewer Construction

 

As with most of my projects, my AC waveform viewer is built on FR4 fiberglass perfboard. The isolation components used are the AMC1200 isolation amplifier by Texas Instruments, and its corresponding power supply is the NXE1S0505MC isolated DC-DC module by Murata. It is rated for reinforced insulation up to 125 Vrms and basic insulation up to 250 Vrms, with a production-tested Hi-Pot rating of 3 kVDC. It does provide reinforced insulation at the voltages used in North America, but is still the weaker link in terms of maximum isolation voltage.

The AMC1200 features differential inputs and outputs, with a maximum input voltage of +/- 200 mV intended for use with low-resistance current shunt resistors.

One potential problem with perfboard is that the through-holes compromise the high-voltage isolation of the circuit (reducing the creepage and clearance distances), acting as multiple series spark gaps. The solution to this is similar to how isolation slots are used on commercial PCBs; that is, drill out the holes! This greatly increases the distance between each side of the circuit and improves the safety of the circuit.

The AC voltage input is scaled down to a manageable level via a resistive voltage divider. I used four high-precision 300kOhm resistors in series, plus a 1kOhm resistor placed across the AMC1200’s input terminals. Since the input is floating thanks to galvanic isolation, I decided to place the amplifier’s input in the middle of the voltage divider (that is, 600kOhm of resistance is present from the neutral and line terminals) to provide some extra protection from harmful electric shock; 120 Vrms / 600 kOhm = 0.2 mA to ground is the maximum amount of current that could possibly flow if I were to contact this floating node on the amplifier (this calculation assumes that my body has zero resistance, but human skin resistance is generally much higher than this). The voltage divider and the AC input terminals of my waveform viewer are further insulated with a layer of clear epoxy for even more protection.

The power supply terminals are fused with a 500 mA fuse before being protected by an SMAJ5.0A TVS (transient voltage suppression) diode and filtered with a 22 uF tantalum capacitor. The AMC1200’s output terminals are protected with 5.1 V Zener diodes at the terminal blocks for ESD and overvoltage protection.

Due to the floating nature of the waveform viewer, this essentially is a differential probe for my oscilloscope (and most high-voltage differential probes actually aren’t isolated!).

Circuit Limitations

No circuit is perfect, and mine is no exception. Here’s a few issues with my circuit that I’d like to address:

Isolation limitations

The AMC1200 only provides “basic insulation“; that is, it will provide protection from electric shock as long as its insulation barrier is not damaged (in other words, there is no redundancy). Circuits that have terminals that can be directly touched by humans needs “reinforced” or “double” insulation to be compliant with international regulations.

The NXE11S0505MC isolated DC-DC converter has a maximum working voltage of 125 Vrms for reinforced insulation and 250 Vrms for basic insulation, with a Hi-Pot test at 3 kVDC. This is lower than the AMC1200’s maximum voltage of 4000 volts, but these should still have enough headroom to keep me safe in the event of a mild voltage spike. It might prove useful to add some sort of surge suppression with a MOV (metal oxide varistor) or similar device.

The perfboard layout is also sub-optimal for the sake of isolation. Despite drilling out a row of holes to increase the creepage and clearance distances, it isn’t quite enough to meet regulations, as the clearance is only 3 mm and the the creepage isn’t much better, around 4 mm. This is still more than enough to withstand normal AC line voltages, but there is always a chance that higher-voltage transients will make their way onto the line and the isolation barrier needs to take this into account.

Output limitations

The AMC1200 provides a differential output that is centred (common-mode voltage) at 2.5 V, which can be an issue with single-ended inputs like that on an oscilloscope. I’ve worked around this by using a floating power supply, like a USB power bank, and connecting the oscilloscope’s ground terminal to the AMC1200’s Vout- pin. Also, the AMC1200 has a limited bandwidth of 60-100 kHz, but for the purposes of waveform monitoring it is sufficient; however, the amplifier’s noise and offset also negatively impacts performance as the high attenuation ratio essentially amplifies these values to the point where the AC waveform looks like a 2 Vdc offset and the noise level is so high that I need to use the averaging or high-resolution acquisition modes on an oscilloscope to get a clean waveform.

Power supply limitations

The NXE1 isn’t quite suited for such a low-power task as operating a single amplifier input. According to the datasheet, the output voltage can rise to twice the rated voltage if it is loaded with less than 20 mA. To combat this, I placed a 5.1 volt Zener diode across the output to provide regulation, which unnecessarily wastes power. Another regulated module like the NXF1 series would be a better choice, and the unit cost at one-off quantities isn’t a huge deal either.

Room for improvement

With this circuit working properly, I had plenty of ideas to make the second iteration even better:

Simultaneous voltage/current inputs

With the ability to measure current, I can perform measurements on the current draw of a device, allowing me to determine the power factor of a device.

True single-ended outputs

Most ground-referenced devices like oscilloscopes are not meant to handle differential inputs directly. Multimeters, especially battery-powered ones, are an exception.

Reinforced insulation rating on amplifier

The AMC1200 is only rated for basic insulation, so having an amplifier rated for reinforced insulation would provide greater electric shock protection. Alternatives like the Silicon Labs Si8920 could be a viable solution.

Waveform captures

 

 

Conclusion

Despite its ubiquity, AC power is a force that must not be taken lightly. Performing measurements on it, especially when viewing its waveform on a non-isolated oscilloscope, requires extreme caution as line voltage (especially in countries where 230 V is common) can easily injure or kill.

Using a voltage divider and isolation amplifier allows for safer measurements of the AC line without introducing distortion, especially compared to transformer-based implementations; this is critical when measuring the waveforms of modified sine wave inverters.

My implementation of an isolated differential probe helps protect me from electric shock when making measurements, while costing much less than a commercial high-voltage differential probe (for example, the CT2593-1, costs almost $330 USD on DigiKey).

But… which one would you trust more?

eMMC Adventures, Episode 3: Building a custom adapter to use cheap eMMC-based 32GB SSD modules

As seen on Hackaday!

While on my quest for more eMMC-based storage devices, I stumbled upon a few devices that piqued my interest: eMMC-based SATA SSDs! I found two models of particular interest: Dell had M.2 modules with a 2.5″ adapter, and HP had custom boards intended for use in cheap laptops (for example, the HP 14-an012nr). Although the former was easier for me to use (but not acquire), I will be focusing on the latter in this blog post.

Overview of HP 14-am/14-an Series SSD Module

Unlike Dell’s convenient M.2 modules, the cheaper boards from HP (costing about $12 USD when I purchased them) had a physical interface intended for use only with its intended host; despite using a SATA interface, physically it used a 10-pin FFC (Flat Flexible Connector, aka “ribbon cable”) since it was designed to work only with HP’s 14-am/14-an series of  low-cost laptops. The boards are labeled “DINERAMD-6050A2862201-DB-A01” and have a copyright date of 2016 in my case.

The BayHub OZ788WR2 Bridge Chip

These eMMC-based SSDs use a curious little chip, the BayHub OZ788WR2 (labeled 788WR2A on the chip itself). It is an SD/MMC-to-SATA adapter, with an SD UHS-II/MMCplus HS200 device interface and SATA II 3Gbps host interface. Apart from the brief description from the manufacturer, no other data is available for the chip (and even finding the chip online is basically impossible).

It’s a shame that so little is known about this chip (and that it’s so rare to find in actual devices), especially since high-performance SD-to-SATA adapters otherwise do not exist, as they use outdated SD-to-CompactFlash adapter chips that are limited to 25 MB/s speeds. If I had the engineering expertise, time, money, and ability to acquire these chips, I’d totally try to make an SD-to-SATA adapter with this chip… but alas, that will still remain a fantasy.

Step 1: Pinout Discovery

The single connector on the eMMC SSD is a ZIF FFC (Zero Insertion Force, Flat Flexible Connector), with no publicly available pinout or any other information. Perhaps this was why I got them for so cheap – apart from holding only 32 GB, nobody could even use them in their own computer even if they wanted to!

When trying to reverse engineer an unknown connector pinout, one needs to first look for ground pins. This is easily accomplished by using a multimeter with a continuity or diode test function, with the multimeter’s positive lead on a known ground point on the DUT (Device Under Test) – screw holes are often good candidates to look for. Ground pins will read as a short, but active IC and power pins will look like a forward-biased diode – appropriately 0.5 to 1 volt. I found 3 power pins (these are often grouped together on connectors for greater current capacity), 3 ground pins, and 4 SATA data pins. The data pins don’t show up on the multimeter test since they have series AC coupling capacitors, but they are easily located next to the connector and have clearly visible differential pairs leading to them.

The issue now is trying to find what order the SATA data pins are in, and how they relate to a regular SATA interface. As it turns out, the pinout is very simple: it matches the pinout of the 7-pin regular SATA interface! This makes sense as the SSD module and the laptop itself are designed to be cheap to manufacture.

Step 2: Building the Adapter (Take 1)

With the pinout known, the harder part is wiring up the connector. However, without a matching connector for the ribbon cable, I have no choice but to solder to it.

As I soon learned, not all flex cables are made of heat-resistant polyimide (aka Kapton) – this one melted before I could even tin the exposed leads. No matter, I’ll just use my trusty magnet wire and hook up the data and power lines! With the help of a salvaged SATA connector from a dead laptop drive, I was able to cobble together a crude adapter for the eMMC SSD board.

Although I didn’t end up taking a picture of the adapter, it wasn’t pretty. It also wasn’t very functional either – although the eMMC SSD board was able to identify itself (on my PC it showed up as a “BHT WR202HH032G E70211F5”), I couldn’t actually perform any data transfer without causing the OZ788WR2 to log hundreds of interface checksum failures (but hey, it supports S.M.A.R.T. data reporting!).

After some tweaking of the wire spacing, I was able to get the adapter stable enough to work, and encased it in hot glue for protection. It lasted a few weeks but eventually stopped working because one of the data wires broke off inside the blob of hot glue. Additionally, the outer contacts on the ribbon cable connector were peeling away from its plastic substrate. It was time for a rebuild.

Step 3: Building a Dedicated eMMC SSD (the teaser!)

Since I had multiple eMMC SSD boards, I took one, replaced the eMMC with a 128GB one from Samsung (the KLMDG8JENB-B041) and removed the ribbon cable connector. In its place, I used some very thin twinaxial cable from a dead MacBook and used a gutted CFast-to-SATA adapter for a shell. Stay tuned for that in another blog post!

Step 4: Building the Adapter (again!)

Much like my previous attempt, I used a salvaged PCB from a dead laptop drive, but left a lot of it instead of chopping it off directly at the connector. This particular one was a dead Samsung HDD, and it had one particular feature that I could use to make a stronger adapter: it had a TSOP footprint for the DRAM cache, which was just the right pitch for me to solder the ribbon cable to!

With a little help of my hot air rework station, I removed the DRAM cache and DC-DC converter, leaving the SATA AC coupling capacitors and the power input components (filtering choke and capacitors, and input overvoltage protection) behind.

After scraping off some solder mask, I soldered the SATA data wires and the ground wires surrounding them with very thin magnet wire, trying to keep the data pairs as close to each other as possible to minimize the chance of interference causing problems. The power wire was soldered to the power input components, right next to the input capacitor for better power delivery.

After checking with the multimeter that no short circuits were present, I hooked up an eMMC board and plugged it into my PC. It enumerated without issues, and running several tests including CrystalDiskMark, h2testw, and Hard Disk Sentinel’s random read test, amassing several hundreds of gigabytes in reads and writes with zero CRC errors logged in the S.M.A.R.T. data.

With everything checked out, I cleaned the circuit with isopropyl alcohol and covered the exposed end of the ribbon cable and the magnet wires with clear epoxy for protection. I also used a bit of epoxy on the flex connector to re-secure the lifted contacts to the substrate.

Conclusion

With a bit of wire and a circuit board from a dead HDD, I was able to reuse cheap eMMC-based SATA SSDs on computers that they weren’t meant for (and they even had copies of Windows 10 Home with extractable license keys! 🙂 ). Although not as fast as a modern full-fledged SSD, its relatively high 4K IOPS performance means it works well enough as a quick boot drive for running quick tests of OS installation without needing to sacrifice a bigger drive just for testing – and they consume less than a watt even when fully active!

Gaining access to the Windows CE desktop (and Doom!) on the Keysight DSOX1102G Oscilloscope

As seen on Hackaday!

TL;DR – Yes, the Keysight 1000 X-Series oscilloscope runs Doom! The journey getting there wasn’t easy, though.

The oscilloscope is one piece of equipment that any self-respecting electronics enthusiast should have. In short, oscilloscopes let you view the electronic waveforms of a circuit, and digital storage oscilloscopes (DSOs) are especially useful since they can reveal infrequent glitches on signals that an analog oscilloscope or a multimeter wouldn’t pick up.

DSC_2506

Keysight DSOX1102G oscilloscope

The subject of my blog post is the DSOX1102G from Keysight Technologies (formerly Agilent), which is part of their low-end offerings that still offer very good value compared to their competitors. As with most of their oscilloscope offerings, they run an embedded operating system called Windows Embedded CE 6.0 (AKA Windows CE or WinCE), but as with most WinCE applications, you almost never see the Windows interface since it’s hidden behind a custom user interface.

Stage 1: Awakening

When the Keysight 1000-X series of digital oscilloscopes first launched in early 2017, one reviewer on Hackaday noticed that certain data-saving features on the oscilloscope would cause it to crash and reboot, noting that a mouse pointer was visible on-screen for a few seconds before the scope rebooted. That post included a GIF of him attempting to save a file which caused the oscilloscope’s software to crash, but I noticed something peculiar in one frame of the video… there was a Windows taskbar visible right before the black error screen was displayed. Interesting…!

frame_060_delay-0.1s

Freeze-frame of oscilloscope on Windows CE desktop shortly before crashing (image courtesy of Hackaday)

When I won my oscilloscope thanks to Keysight’s Scope Month giveaway program, I didn’t think much of this for a couple months until I encountered the crash screen as well. In my case, I found that the Windows CE title bar was visible on top of the oscilloscope’s crash handler; dragging the title bar simply ghosted the window on-screen and doing this a few times more caused Windows CE itself to hang. This was a pretty infrequent occurrence, so when I encountered subsequent crashes I simply let the oscilloscope’s crash handler scan the internal file systems and reboot the operating system.

However, by this point I was intrigued and wanted to find a way to learn more about what’s going on with the underlying Windows CE operating system. I found that the oscilloscope’s USB port is rather sensitive to errors and simply wiggling a USB drive in in the USB port would crash the oscilloscope. This still wasn’t enough for me to gather enough information since I could not get the oscilloscope to do this consistently.

Thus begins my quest to access the Windows CE desktop. Let’s go!

At first I tried a software-only solution, attempting to craft a .ksx firmware update file (in reality it’s just a .cab archive) that would close the oscilloscope software and open Windows Explorer – no dice. The oscilloscope software would simply throw an error message saying that it couldn’t open the update file. As it turns out, this solution wouldn’t have worked even if I could get the oscilloscope to load the update file since the oscilloscope’s software doesn’t exit to the desktop during the firmware update process anyway. Having encountered my first significant roadblock, I set my curiosity aside and simply used the oscilloscope as, well, an oscilloscope for a while.

Stage 2: Looking Deeper

True to my curious nature, one day I decided to see whether the oscilloscope would read and write to 3.5″ floppy disks (or as the young ones might call them, “3-D printed save icons” 🙂 ) using a USB floppy drive – and it did! However, I noticed one very peculiar issue: the oscilloscope would crash on boot if I left the floppy drive in the USB port when I powered it on. Eureka! – I had finally found a means to reliably crash the oscilloscope.

Unfortunately, this is where I hit my second roadblock. This crash-on-boot phenomenon would only occur if the floppy drive was the only device plugged into the USB port; it would not crash if I used a USB hub between the oscilloscope and the floppy drive. This meant that I would have to be very fast in order to switch between the floppy drive and a USB mouse/keyboard. Racing against the clock to unplug the drive and plug in my combination USB keyboard/touchpad in the middle of the boot cycle was getting tiring and frustrating very quickly. I needed a better solution… a hardware solution.

20181015_011332

Custom USB A/B switcher I built to quickly swap USB devices

Using an old USB cable, a dead USB hub and a DPDT (dual-pole, dual-throw) pushbutton switch, I created a USB A/B switcher to make the process of switching between devices fast and easy. Using this method, I was able to try interacting with the Windows CE operating system for the fraction of a second that the taskbar was visible on-screen before the crash handler barged in to ruin my fun. With the magic of my Samsung Galaxy S9’s slow-motion video feature, I was able to determine that I could send keystrokes and Windows CE would act on them accordingly – even while the system was still on the splash screen! I was able to somewhat get information about the system by blindly entering keystrokes and seeing what the output was when the oscilloscope software crashed. Enter roadblock number 2…

The ability to very briefly interact with Windows CE was great, but it was still useless since I couldn’t actually take control of it before the oscilloscope’s crash handler rebooted the system. It appeared that the crash handler had a pretty tight lock on the operating system since no amount of mashing the Windows or Ctrl+Alt+Delete would let me back into Windows CE.

Stage 3: Getting a Foothold

Once again, my random curiosity would come in handy when I decided to use my old Sony Clie PEG-NX73V (a PalmOS handheld PDA dating back to 2003) as a USB drive; it had a Data Import feature which allowed the user to drag-and-drop files onto its memory card as if it were a removable disk.

Much like the USB floppy drive, a similar crash-on-boot effect occurred when I left the PDA plugged in when powering on the oscilloscope. But… unlike the floppy drive, the oscilloscope’s crash handler seemed to interpret the PDA’s file system as a corrupt firmware partition and asked to load a firmware update file from an external USB drive.

20180830_160455

Firmware update prompt

This wasn’t very consistent behaviour, as I found that sometimes the oscilloscope software would load anyway and resulted in a very strange Windows CE window appearing with a bright-cyan mouse pointer that ghosted the screen if I attempted to move it aside. However, in this limbo-like state I was able to drag the InfiniiVision oscilloscope software’s window aside and tried mucking around the operating system that way. However, the oscilloscope software was very aggressive and would regain focus every time I clicked behind its window; after some struggling with the system’s very strange colour palette I was able to somewhat fumble my way around the operating system. I couldn’t browse the file system since I couldn’t wrestle control from the oscilloscope software long enough to do so, but I was able to bring up the System Properties dialog box which revealed that the oscilloscope is based on Windows CE 6.00 and had 100 MB of RAM at its disposal.

20180830_012949

Accessing some Windows CE dialogs while InfiniiVision software still running – what a mess!

I then decided to browse the EEVblog forums, where community members there were hard at work trying to hack extra features into the oscilloscope. From there I found out that the firmware looks for a file called “infiniiVisionStartupOverride.txt” on the USB drive’s root, and if it does this, it will try to load the oscilloscope software from there. Although it appeared that the firmware didn’t actually load the software from the USB drive, it did stop the oscilloscope software from starting up and pulling control away from me. This is where things get really interesting – the crash handler opens an Explorer dialog box, and simply entering *.* into the file name textbox would let me begin browsing the oscilloscope and USB drive’s file systems! This is exactly what I needed to being controlling Windows CE! However, I was once again presented with another roadblock: the oscilloscope would reboot after 60 seconds which limited the amount of time I could browse the operating system.

 

 

After copying a few Windows CE tools like Windows CE Task Manager, I noticed that there were two interesting processes that were still running when the crash handler was visible, “recoverInfiniiVision.exe” and “processStartupFolder.exe”; it seemed like the recoverInfiniiVision process was indeed the crash handler that was preventing me from accessing Windows CE when the oscilloscope software crashed. Killing the processStartupFolder process with iTaskMgr (Windows CE Task Manager’s trial version can’t kill processes) was enough to prevent the oscilloscope from rebooting, and killing the recoverInfiniivision process left us with a blank Windows CE desktop – I was in! Unfortunately, I was unable to restore the taskbar which made navigating the OS quite cumbersome.

After creating a new folder on the desktop to open Explorer, I was finally able to do some real exploration (heh) of the oscilloscope’s file system. The tool Total Commander/CE was of great help since it also had a built-in text editor, which this version of Windows CE lacked.

20180830_110105

Browsing internal file system with Total Commander/CE (no taskbar just yet…)

Stage 4: Full Control

Now that I was able to see the blue sky desktop, all I needed for the full Windows CE experience was to bring back the taskbar. After a bit of Google-fu and browsing Stack Overflow, I whipped up some sample code to turn the taskbar back on. After opening this from Explorer, I had a full Windows CE desktop! YES – finally I had full control over the underlying operating system!

20180830_160618

Freedom at last – a full Windows CE desktop on my oscilloscope!

From there, I began looking through the file system to see what interesting utilities were in store. It appeared that the Command Prompt wouldn’t open at all when I tried to run it, but digging through the Registry with an editor, there was a Registry key at HKEY_LOCAL_MACHINE\Drivers\Console\OutputTo that was set to 0xFFFFFFFF. Setting that key to 0 was enough to make the Command Prompt visible on the desktop, so I created another small program to take care of this as well.

Things were looking good, so I created a batch file with all the commands required to kill the oscilloscope software, the startup folder handler, the crash handler; restore the taskbar; and re-enable the Command Prompt. However, this required I use my PDA to open the crash handler’s firmware recovery menu which prevented others from being able to reproduce this effect.

After some digging around, I found out that as soon as the splash screen appeared and the front panel LEDs began cycling, Windows CE would accept any keystrokes even without a software-crashing device present; pressing Windows and U caused the oscilloscope to hang as I was essentially opening the Start menu and selecting the Suspend option (which I guess the operating system had no means of regaining control since the oscilloscope only had a hardware power switch). With this in mind, I renamed my batch file to “a.bat” for easier typing, and tried to launch the batch file right in the boot process by pressing Windows, R (to open the Run dialog), then “\usb\a.bat” and finally Enter to run the batch file. This only caused the oscilloscope to remain at the splash screen but the Windows CE operating system was otherwise alive even if I wasn’t able to see what was actually going on. As it turns out, the crash handler is the key to making the Windows CE desktop visible, and all I had to do was add some lines in my batch file to launch and subsequently kill the crash handler. With the final touches added to my batch file, I can (semi-)automatically boot the oscilloscope right to the desktop, with just a USB drive, mouse and keyboard!

Stage 5: Yes, it runs DOOM!

Now that I had access to Windows CE, I could finally put an answer to the question “Does it run DOOM?” As a matter of fact, yes it can! It only took a year and a half after the oscilloscope’s launch, but this milestone had finally been achieved.

 

 

Stay tuned for my next blog post where I play around some more with this iconic video game – on a piece of hardware that was never intended to play games in the first place.

giphy

Doom in action at glorious 320×240, 256 colours! On an oscilloscope!

Downloads

I have released the files required for you to try this on your own scope – but be warned, I am not responsible if you brick your scope or anything else bad happens! I have only tested this on my DSOX1102G but I suspect that the rest of the 1000 X-series and other Keysight scopes that have the firmware recovery option may work too. The oscilloscope firmware is laid out such that the Windows CE file system is all in RAM and is not retained upon reboot, so any system-breaking changes to Windows will not brick the scope (the firmware files are found in hidden NAND Flash-resident directories that aren’t accessible from Explorer unless you enter them manually by name). Please leave a comment if you decide to try it yourself, and as to whether or not it works. 

You will need to format a USB drive with FAT or FAT32 and extract the .zip archive of my tool, Scope Liberator (click here!), to the drive’s root. Instructions are found in readme.txt.

If you’re interested in the source code for the support programs to re-enable the taskbar and local command prompt, I have made them available as well (they’re literally lines derived from sample code, but at least the icons for ShowTaskbar.exe and EnableLocalCmd.exe are original!).

Upgrading a passive Power over Ethernet splitter with 802.3af compatibility

As seen on Hackaday!

If you haven’t heard of Power over Ethernet, chances are you’ve experienced its usefulness without even knowing about it. Power over Ethernet (PoE for short) does exactly as the name implies: power is sent over the same Ethernet cable normally used for data transfer. This is often used for devices like IP phones and wireless access points (often you see these APs in restaurants and other establishments mounted to the ceiling to provide Wi-Fi access), as it is far easier, cheaper and safer to provide low-voltage power instead of wiring in AC power which requires the help of a licenced electrician.

 

Continue reading

eMMC Adventures, Episode 2: Resurrecting a dead Intel Atom-based tablet by replacing failed eMMC storage

As seen on Hackaday!

Recently, I purchased a cheap Intel Atom-based Windows 8 tablet (the DigiLand DL801W) that was being sold at a very low price ($15 USD, although the shipping to Canada negated much of the savings) because it would not boot into Windows – rather, it would only boot into the UEFI shell and cannot be interacted with without an external USB keyboard/mouse.

The patient, er, tablet

The tablet in question is a DigiLand DL801W (identified as a Lightcomm DL801W in the UEFI/BIOS data). It uses an Intel Atom Z3735F – a 1.33GHz quad-core tablet SoC (system-on-chip), 16GB of eMMC storage and a paltry 1GB of DDR3L-1333 SDRAM. It sports a 4500 mAh single-cell Li-ion battery, an 8″ 800×1200 display, 802.11b/g/n Wi-Fi using an SDIO chipset, two cameras, one microphone, mono speaker, stereo headphone jack and a single micro-USB port with USB On-The-Go support (this allows the port to act as a USB host port, allowing connections with standard USB devices like keyboards, mice, and USB drives).

Step 1: Triage & troubleshooting

The first step was to power on the tablet to get an initial glimpse into the issues preventing the tablet from booting. I was able to confirm that the eMMC was detected, but did not appear to have any valid MBR or file system; therefore, the UEFI firmware defaulted to entering the UEFI shell (which was of little use on its own as there is no on-screen keyboard available for it).

DSC_2191

DigiLand DL801W with UEFI shell

However, one can immediately notice there is an issue with the shell: how do you enter commands without an on-screen keyboard? The solution was to use a USB OTG (On-The-Go) dongle to convert the micro-USB type B port into a USB type A host port.

Using the shell commands, I tried reading the contents of the boot sector, which should end with an MBR signature of 0x55AA. Instead, the eMMC returned some nonsensical data: the first half of the sector had a repeating byte pattern of 0x10000700,  and the second half was all zeroes (0x00) except the last 16 bytes which were all ones (0xFF). The kicker was that this data was returned for every sector I tried to read. No wonder the eMMC was unbootable – the eMMC had suffered logical damage and the firmware was not functioning correctly.

After creating a 32-bit Windows 10 setup USB drive (these cheap low-RAM PCs often use a 32-bit UEFI despite having a 64-bit capable CPU), I opened Hard Disk Sentinel to take a deeper look at the condition of the onboard eMMC.

DSC_2198

Malfunctioning Foresee 16GB eMMC visible in Hard Disk Sentinel

The eMMC identified itself with a vendor ID of 0x65, and an MMC name of “M”. It reported a capacity of 7.2 GB instead of the normal 16 GB, another sign that the eMMC was corrupted at the firmware level.

DSC_2199

Foresee 16GB eMMC returning corrupted data

Using HDS, I performed a read scan of the entire eMMC despite its failed condition. The read speeds were mostly consistent, staying between 40 to 43 MB/s. A random read test revealed a consistent latency of 0.22 ms.

In order to assess whether the eMMC was writable in its current state, I ran a zero-fill and subsequent read scan. The eMMC appeared to accept writes but did not actually commit them, as HDS threw verification errors for all sectors.

After the tests in HDS, I decided to attempt an installation onto the eMMC to assess its writability. Windows Setup failed to create the disk partition structures, throwing an error message reading “We couldn’t create a new partition or locate an existing one”.

Step 2: Teardown & eMMC replacement

Since the onboard Foresee NCEMBS99-16G eMMC module was conclusively determined to be faulty, there was no point keeping it on the tablet’s motherboard. This also provided an opportunity to upgrade the eMMC to a a larger and faster one. Since this required the tablet to be disassembled, I decided to do a teardown of the tablet before attempting to replace the failed eMMC module (the teardown will be in a separate blog post when the time comes).

After removing the insulating plastic tape on the bottom of the PCB, I masked off the eMMC with some kapton tape to protect the other components and connectors from the heat of my hot-air rework station. With some hot air and patience, the failed Foresee eMMC was gone. This also revealed that the eMMC footprint supported both the 11.5×13 mm and 12×16 mm sizes, but the 12×16 mm footprint did not have the extra 16 solder balls for reinforcement (most eMMC balls are unused so their omission had no negative functional effect).

DSC_2215

Foresee eMMC removed from DL801W’s motherboard

Instead of a barely-usable 16 GB of eMMC storage, I opted to use the Samsung KLMBG4GEND-B031 – a 32 GB eMMC 5.0 module. This chip boasts more than 2000 IOPS for 4K random I/O, which should be a boon for OS and application responsiveness.

DSC_2219

Replacement Samsung KLMBG4GEND-B031 eMMC installed

A little flux and hot air was all I needed to give the 32 GB eMMC a new home. Time to reassemble the tablet and try installing Windows 10 again.

Step 3: OS reinstallation

After spending a few minutes cleaning the board and reinstalling it in the tablet, it was time to power the tablet back on, confirm the presence of the new eMMC and reattempt installing Windows.

DSC_2221

Installing Windows 10 from USB drive via USB-OTG adapter

The eMMC replacement proved to be successful; within minutes, I was off to the races with a clean installation of Windows 10.

Conclusion

DSC_2223

DL801W restored, running Texas Instruments’ bqSTUDIO software

This was a pretty fun project. With some electronics and computer troubleshooting skills, I had a tablet capable of running desktop Windows programs. Its low power consumption and USB host capabilities made for a great platform to run my Texas Instruments battery hardware and software without being tethered to my desktop.

However, I was not finished with this tablet. The 1 GB of onboard RAM made Windows painfully slow to use, as the CPU was constantly bogged down performing memory compression/decompression. The 32GB of eMMC storage I initially installed began feeling cramped, so I moved to a roomier 64GB (then 128GB) eMMC.

I won’t go into the details of how I upgraded the RAM in this post, as it’s a long story; simply put, soldering the RAM ICs was the easy part.

eMMC Adventures, Episode 1: Building my own 64GB memory card with a $6 eMMC chip

As seen on Hackaday!

There’s always some electronics topic that I end up focusing all my efforts on (at least for a certain time), and that topic is now eMMC NAND Flash memory.

Overview

eMMC (sometimes shown as e.MMC or e-MMC) stands for Embedded MultiMediaCard; some manufacturers create their own name like SanDisk’s iNAND or Hynix’s e-NAND. It’s a very common form of Flash storage in smartphones and tablets, even lower-end laptops. The newer versions of the eMMC standard (4.5, 5.0 and 5.1) have placed greater emphasis on random small-block I/O (IOPS, or Input/Output operations per second; eMMC devices can now provide SSD-like performance (>10 MB/s 4KB read/write) without the higher cost and power consumption of a full SATA- or PCIe-based SSD.

MMC and eMMC storage is closely related to the SD card standard everyone knows today. In fact, SD hosts will often be able to use MMC devices without modification (electrically, they are the same, but software-wise SD has a slightly different feature set; for example SD cards have CPRM copy protection but lack the MMC’s TRIM and Secure Erase commands. The “e” in eMMC refers to the fact that the memory is a BGA chip directly soldered (embedded) to the motherboard (this also prevents it from being easily upgraded without the proper tools and know-how.

When browsing online for some eMMC chips to test out, I found a seller that had was selling 64 GB eMMC modules for $6 Canadian per pop; this comes out to a very nice 9.375 cents per gigabyte (that’s HDD-level pricing right there!). With that in mind, I decided to buy a couple modules and see what I could do with them. A few days later, they arrived in the mail (and the seller was nice enough to send three modules instead of just two; the third module’s solder balls were flattened for some reason).

Toshiba eMMC Module

Toshiba THGBM4G9D8GBAII eMMC 4.41 modules

Toshiba THGBM4G9D8GBAII eMMC 4.41 modules

The Flash memory I used is a Toshiba THGBM4G9D8GBAII. According to a Toshiba NAND part number decoder:

  • TH: Toshiba NAND
  • G: Packaged as IC
  • B: Vcc (Flash power supply) = 3.3 V, VccQ (controller/interface power supply) = 1.8 or 3.3 V
  • M: eMMC device
  • 4: Controller revision 4
  • G9: 64 GB
  • D: MLC NAND Flash
  • 8: Eight stacked dice (eight 8 GB chips)
  • G: 24nm A-type Flash (appears to indicate Toggle Mode interface NAND)
  • BA: Lead-free and halogen-free
  • I: Industrial temperature grade (-40 to 85 degrees Celsius)
  • I: 14 x 18 x 1.2 mm BGA package with OSP (Organic Solderability Preservatives)

Given the low, low price of the eMMC chip, I had to make sure that I wasn’t given counterfeit Flash memory (often fake flash would have only 4 or 8 actual GB usable, with most of the address space looping over itself, causing data loss with extended usage). This involved find a way to temporarily connect the eMMC to my computer. I had a USB 2.0 SD/MMC reader on hand as well as a laptop with a native SD host interface, so now all I needed to do was break out the eMMC signals on the BGA package so that I can connect it to the reader.

eMMC Pinout… or is it Ball-Out?

There are plenty of pinouts for eMMC on the Internet, but they all show the pinout for a top view. Since I’m not soldering the eMMC to a PCB, I need to get a bottom view. I took a pinout diagram from a SMART Modular Technologies eMMC datasheet, rotated it to a landscape view, flipped it vertically, then flipped each row’s text in order to make it readable again. I then copy-pasted this into PowerPoint and traced out the package and ball pinouts. This allowed me to colour-code the different signal and power lines I’ll need to implement, including the data, clock, command and power lines. Curiously enough, one of the ground pins (VssQ, or controller/MMC I/O ground) was not a ground pin like the standard required; because of this, I decided to leave that pin open-circuit. Additionally, there were several pins that were not open-circuit, but did not have a known purpose either (these are probably used as test pads for the internal NAND Flash interface – perhaps they could be reused as raw NAND with the right controller, but the exact purpose of these pads will need to be reverse engineered).

Toshiba THGBM4G9D8GBAII eMMC pinout (solder balls facing up)

Toshiba THGBM4G9D8GBAII eMMC pinout (solder balls facing up)

eMMC Reader: Take 1 (Failed!)

For the first reader, I cut open a microSD-to-SD adapter, exposing the eight pins inside. I soldered a cut-up UDMA IDE cable and glued them in place. Despite my careful work, I still melted a hole through the thin plastic shell of the adapter; thankfully this did not affect the adapter’s ability to be plugged in.

I used double-sided foam adhesive tape and a piece of perfboard to create a small “test bed” for the eMMC module. Using some flux, solder wick, and a larger soldering iron tip, I removed all the (lead-free) solder balls on the center of the IC and replaced them with leaded solder bumps to make soldering the tiny 40-gauge magnet wire easier.

After bringing out the minimum wires required (VCC/VCCQ, GND, CLK, CMD, and DAT0 for 1-bit operation), I soldered the wires of my quick SD adapter, and plugged it into the SD card slot of a (very old) Dell Inspiron 9300.

Calling this board’s operation flaky doesn’t do it justice. It would fail to enumerate 9 out of 10 times, and if I even tried to do anything more than read the device capacity, the reader would hang or the eMMC would drop off the SD/MMC bus and show an empty drive in Windows. It was clear I had to do a full memory card “build” before I could verify the usability of the eMMC Flash memory.

eMMC in an SD Card’s Body: Take 1 (Success… half of the time)

I had a 16 MB (yes, megabyte) SD card lying around somewhere, but as usual, I couldn’t find it among all the clutter around my desk and workspace. Instead, I found an old, slow Kingston 2 GB SD card that I felt would be a worthy “sacrifice” since it was an older type that still had a thin PCB inside (most SD cards nowadays are monolithic, which means it’s one solid chunk with a few pads exposed). After opening up the case carefully with an Exacto knife, I wiggled out the old PCB. I desoldered the orignal 2 GB NAND Flash, and began work on breaking the SD card controller from the PCB as it was a chip-on-board design. It took a while, but I was able to ensure that none of the old SD card hardware would interfere with my rebuild.

I removed the eMMC from the board I made previously, and tested the thickness of it to ensure that it would fit inside the SD card case. It did, although the 0402 surface-mount decoupling capacitors I intended to install would cause a few bumps to be visible through the thin plastic SD card casing.

With my eMMC and SD card pinouts on hand, I used a small bead of epoxy to affix the eMMC to the PCB, balls-side up. I used magnet wire to connect the data lines (4 wires for 4-bit operation which is the maximum that the SD standard supports), and used the unused pads on the eMMC as a kind of prototyping space where I could install ceramic capacitors as close to the module as possible. I used a 0.1 µF 0402 size ceramic capacitor across the VDDi (eMMC internal regulator) and a neighouring GND pad. The rest of the power pads were wired in parallel with a few extra 0.1 µF capacitors added. I made use of the existing three 1 µF capacitors on the PCB as both extra decoupling and connection points for VCC and VCCQ. To prevent shorting of the inner CMD and CLK pins, I only removed the enamel coating from the magnet wire at the very end so I could solder them but avoid the issue of shorting those pins against the other signal and power lines. I then soldered these wires to the terminals on the other side of the PCB.

After spending about ten minutes wriggling the PCB into the SD card casing without damaging the wires, I used a multimeter to ensure all the pins were connected (use a multimeter in diode mode, with the positive lead connected to ground – any valid pins should read ~0.5 volts), and also ensured that there were no polarity reversals or shorts on the power pins.

Now… the moment of truth. At this point my USB 2.0 card reader still wasn’t cooperating with me, so I tried the only other ‘fast’ reader I had at the time – an SD to CompactFlash adapter.

To my relief, I finally got a (mostly) usable card. It appears this particular model has been pre-formatted with FAT32. Viewing the MBR in Hard Disk Sentinel shows nothing notable, apart from the fact that it’s pretty blank and is indicative that it wasn’t formatted for use as a PC boot medium.

Things began to fall apart after I tried running speed tests, as the card would hang if it experienced a lot of write activity at once. I suspected this was a power supply-related issue, so I modified my layout to add more capacitance. For good measure, I added 56 ohm termination resistance for the DAT0-4 data lines, using a small resistor network harvested from an old dead MacBook motherboard.

After these modifications, performance was much, much better. Now that the card was usable, I could finally run some speed tests.

eMMC in an SD Card’s Body – This time, with more feeling decoupling!

After adding several 100 nF and 1uF 0402-size ceramic capacitors on the eMMC package, I was able to get a stable card that could be read by (most) SD card readers. As I was rather anxious to get a decent benchmark from the eMMC, I decided to forego the cheaper Amazon Prime route, and go to my local PC parts store to buy a USB 3.0 card reader – the Kingston FCR-HS4.

After placing the eMMC and SD card PCB back into its plastic casing, I was relieved to see that Windows immediately recognized its presence. All I had to do then was open CrystalDiskMark and run the benchmark. Drum roll please…

Toshiba THGBM4G9D8GBAII/064G4A benchmark in CrystalDiskMark

Toshiba THGBM4G9D8GBAII/064G4A benchmark in CrystalDiskMark

Although I was happy to get a usable benchmark score, my belief that all eMMC devices inherently had better 4K random I/O speeds than their SD counterparts was immediately busted. My guess is that random I/O wasn’t considered to be a priority until eMMC 4.5 or 5.0, and my eMMC modules are only version 4.41.

eMMC module listed as version 4.41

eMMC module listed as version 4.41

After the speed test, I ran the card through the popular Flash memory testing tool h2testw to make sure that I was not given a counterfeit device.

H2testw showing flash memory is good

H2testw showing flash memory is good

Excellent – it’s a genuine device. Despite the slower performance than expected, I’m happy that the memory capacity is as it should be.

“eMMC identification and CSD data, please”

As is the case with any USB memory card reader, I cannot access any of the eMMC device information (that is, the CID/Card Information Data and CSD/Card Specific Data registers). I took a spare SSD from my collection and got a quick Windows 10 installation running on one of my laptops that had a native SD host interface.

eMMC identified as Toshiba 064G4A MMC

eMMC identified as Toshiba 064G4A MMC

Interesting. The eMMC identifies itself as a Toshiba 064G4A MMC card. Googling that information brought up literally zero information, so it appears I’m the only one to have found (or published) any information about it. Although eMMCs support some degree of S.M.A.R.T. health reporting like mainstream SSDs and HDDs, no (easily-available) software (for Windows at least) is available to read it.

Linux has the ability to report the CID and CSD data as long as the native SD host interface is used, as opposed to a USB card reader.

CID: 11010030363447344100151344014e00
CSD: d00e00320f5903ffffffffef96400000
date: 04/2011
enhanced_area_offset: 18446744073709551594
erase_size: 8388608
fwrev: 0x0
hwrev: 0x0
manfid: 0x000011
oemid: 0x0100
preferred_erase_size: 8388608
prv: 0x0
raw_rpmb_size_mult: 0x2
rel_sectors: 0x10
serial: 0x15134401

With the help of Gough Lui’s CID and CSD decoders, I was able to gain some more information about the eMMC device, but not too much as the information I was originally interested in was already collected by this point.

Out of the Reader and Back Into the (CF) Adapter

Now that I know what the eMMC is capable of, I decided to try putting it back into my SD-to-CF adapter and doing another benchmark.

eMMC in FC-1307A SD-to-CF adapter. Note the limited performance of this chipset.

eMMC in FC-1307A SD-to-CF adapter. Note the limited performance of this chipset.

This test highlights one of the biggest limitations of the FC1306T/FC1307A chipset that so many adapters use: their performance is limited to a maximum of 25 MB/s per channel. Good thing I purchased that USB 3.0 reader…

Conclusion

This was quite the learning experience. I not only learned that eMMC flash memory does not necessarily have the near-SSD performance that the latest devices offer, but I learned how to “exploit” the unused pads of a BGA device as a sort of “prototype area” for soldering small components onto.

Did I save any money by rolling my own Flash storage device? Absolutely not – given how much time I spent on this, if I paid myself minimum wage ($12 per hour where I live), I could have bought at least three higher-performance 64GB SDXC cards with none of the frustration of trying to adapt an embedded memory device as a removable memory card. But where’s the fun in that? 🙂

A Temporary Hold: Creating Li-Ion battery holders with prototype boards and pin headers

As seen on Hackaday!

Lithium-ion batteries are great. They have high energy density, are lightweight, and in the case of many portable devices, they can be easily swapped in and out. One problem with prismatic (the types you often find on cell phones that have a set of flat contacts on one end of the battery) packs is that they’re all custom; the cell may be standardized but the pack it’s in is often proprietary to a certain make and model. Sure, there are “universal” holders out there, but they provide poor electrical contact at best. Since I need a secure electrical connection when using my battery fuel gauges, I sought to create a more sturdy holder for the batteries I have lying around.

The construction of the holder is pretty simple. A strip of female pin header (I used a single-pin-width header but a double-width one can be used for greater mechanical strength) is used as an end-stop for the battery, and a right-angle pin header is used to create contact with the battery’s terminals and to provide the physical “clamping” needed to create a good connection. The right-angle header can be bent and soldered into place to adjust the holder to the particular cell you’re using. Additionally, be sure to use some high-quality FR4-based boards as the brown-coloured paper/resin-based boards won’t have as good resilience and strength, and probably won’t be plated through either (this improves the structural integrity of the holder since the pin headers will be under a bit of physical stress).

For connections, I have a 2-pin header (physically a 3-pin header with one removed to denote polarity) and a set of screw terminals. These are wired up using a flat ribbon “wire” used to connect solar cells together as they can handle several amps and come pre-tinned with solder.

This sort of setup can be adapted to nearly any commercially available prismatic battery, provided it uses a flat contact area on the sides.

Convenient chips, inconvenient packages: Making use of the Texas Instruments bq27421-G1 lithium-ion battery fuel gauge chip

As seen on Hackaday!

I ordered some sample chips from TI a few weeks ago, most of them being lithium-ion battery “fuel gauge” chips. These chips are used in electronic devices to determine exactly how much energy is in the battery, and if the chip’s sophisticated enough, provide a “time until empty” prediction.

The bq27421 from TI is packaged in a tiny 9-ball grid array, packaged as a wafer-level chip scale package (WLCSP). This means there is no epoxy covering like normal ICs, making for a compact design that’s a good thing for space-constrained applications like modern cell phones. I’ll talk about this chip later on in this post.

The tiny BGA package means that prototyping with these chips is difficult if not impossible, depending on how large the chip is that you’re working with. The bq27421 is about 1.6 mm x 1.6 mm, which is less than 1/3 of the size of a grain of rice. No way you’d be able to put that on a breadboard… right?

2013-06-14 15.51.58Well, you can, with a small breakout board, some magnet wire, epoxy (a bigger deal than you might initially think), patience and steady hands. I mounted the chips in what I call a mix between dead-bug (where the contacts face up as if the chip was like a dead bug on the ground) and chip-on-board construction (where the chip is glued directly to a board, wire-bonded and then covered in epoxy). I used some SOIC-to-DIP boards from DipMicro Electronics (link). I often use these boards when doing work on prototyping board since using these surface-mount parts reduce the board’s height compared to using actual DIP packaged chips (which are much less common for modern ICs anyway).

The chip is first affixed to the breakout board using a small amount of epoxy and allowed to cure for several hours. The epoxy, from what I’ve found, is crucial to your success; superglue and other adhesives won’t stand up to the heat of a soldering iron, and if it loosens you can end up ruining your chip and wasting your time spent working on it.

After letting the epoxy cure, I then prepare the bond pads around the chip. I place a liberal amount of solder on each pad to allow easy connection with the iron later; I want to minimize the stress on the tiny 40-gauge magnet wire because once the connection is made, the solder ball that the chip came with won’t be as easy to solder to the second time around.

Next up is the actual soldering process. I created a pinout for the board in PowerPoint to help plan out how I’ll solder the wires. After tinning a long length of 40-gauge magnet wire, I then solder the wire first to the solder ball on the chip, then solder the other end to the pad I previously put solder on. To minimize the stress on the wire afterwards, I use a small utility knife to cut the end of the wire where the pad is. I then complete this for the rest of the contacts. This took me an hour and a half the first try, but took me about 20 minutes the second time around. Also, for my second try, for the BAT and SRX pins, which carry the full current for any loads connected, I used 30-gauge wire-wrapping wire to allow a bit more current-carrying capacity. It probably is overkill since the maximum current rating for the bq27421 is 2 amps continuous, but I felt a bit more at ease connecting the pins this way.

After checking for short and open circuits with a multimeter I then placed headers onto the board and put it into my “evaluation board” that I created just for this chip. Using an EV2400 box from TI, used to connect to their vast range of battery-management chips, I connect the box to my PC and run their GaugeStudio software to verify that the chip works.

… and it does, like a charm! I was able to communicate with the chip and also view its operation in real-time.

One thing that was causing me trouble before was that after removing the battery and putting another one in, I found that the gauge chip sometimes wouldn’t be recognized by the PC. Being unsure why it was doing this, I dug through the reference manual, and found one tiny part in the manual that showed me why it wasn’t working consistently.

gpoutThe GPOUT pin was left floating on my board, and the chip requires a logic high signal before it starts up. This brings back memories of my digital electronics class in college; these floating inputs can cause all sorts of trouble if you’re not careful, and in this case, it was mentioned only once in the reference manual. After using a 1 megohm resistor to pull up the pin, the chip worked flawlessly. Now that I verified that the chip was working, I mixed up some more epoxy and covered the chip, making sure that the bond wires and chip were covered to prevent damage.

After all that, I had a couple working highly-advanced battery gauges that I could fool around with, and also learned a couple things about deadbugging SMT components and also the basics of chip-on-board construction.