About BTS Clocks
Many of the problems getting phones to "camp" in both OpenBTS and OpenBSC have been related to frequency accuracy. To understand this problem, you have to understand clock technologies and you have to understand how a typical GSM phone acquires a signal.
Let's start with this, from GSM 05.10 Section 5.1:
The BTS shall use a single frequency source of absolute accuracy better than 0.05 ppm for both RF frequency generation and clocking the timebase. The same source shall be used for all carriers of the BTS. For the pico BTS class the absolute accuracy requirement is relaxed to 0.1ppm.
First: What is Supposed to Happen
An error of 0.05 ppm is 45 Hz in the low bands (850/900) and 90 Hz in the high bands (1800/1900). That is VERY accurate, the kind of accuracy you get from a GPS-disciplined VCTCXO (temperature-compensated voltage-controlled crystal oscillator) or an OCXO (oven-controlled crystal oscillator, a $100-$200 part). Doppler shift in the direct path from a 150 km/h car or train is on the order of 0.15 ppm, so the necessity of this kind of accuracy is questionable, but that's what is in the spec. Given Doppler effects, the observed frequency difference between two carrier-grade basestations is a few hundred Hz, worst case.
The typical GSM handset has a medium-quality VCTCXO. On a cold start, not locked to any outside clock, this clock has a an accuracy of around 20 ppm, or about 18 kHz in the low bands (850/900) and about 36 kHz in the high (1800/1900) bands. So, starting "cold" with no information, the phone runs a time-consuming frequency search over the whole possible drift range of its own clock and continues this search until it finds a beacon signal from a BTS. Once the phone finds a beacon, it uses the carrier from the BTS to correct its own local clock by adjusting the control voltage on the VCTCXO. From that point on, the VCTCXO is just as accurate os the BTS clock as long as it is receiving the BTS signal. If the BTS signal is lost, the handset's clock will drift within some know worst-case rate. Knowing how long its local clock has been drifting, the handset can calculate the worst-case drift and use that information to narrow next frequency search.
So, you turn on a handset from "cold", it does a big frequency search until it finds a BTS. If it looses that BTS signal, it searches for another, but this time it does a much smaller frequency search, knowing that the signal from the next BTS will be within a few hundred Hz of its expected frequency and that its own local clock has not drifted much since last seeing a beacon. If the handset finds another beacon during this small search, it will stop searching. (That's critical for the next part of this discussion.) If the handset fails to find a beacon in the small search, it will widen the search range or a multiband phone might try a different band.
Second: What Goes Wrong When You Use an Inaccurate Clock
Now, suppose your BTS uses a simple XO (crystal oscillator) clock, which produces an error on the order of a several kHz on your RF carrier. This is the case for OpenBTS with a "stock" USRP and it is also the case with the the Siemens BS-11 (used by OpenBSC) when it is locked to the clock from a typical ISDN-grade E1 interface card. This kind of BTS can fail in three possible ways, depending on just how bad the clock error is and how the system is being used:
Effect of Large Frequency Errors (Several kHz or More)
The first type of failure is where the BTS XO error is so large that your BTS beacon falls completely outside the handset's "big" search range. The handset simply never finds the BTS signal. This is the error Fabian Uehlin found when he first tried to operate OpenBTS in the 1800 band. He fixed it by using an external clock with much better accuracy.
Effect of Modest Frequency Errors (500 Hz to a Few kHz)
The second type of failure is when your BTS RF carrier is within the "big" search range but differs from the local "real" networks by more than a few hundred Hz. In this situation, the handset will either see your BTS or it will see the "real" network, but *not both*. Whatever system the handset sees first will control its clock and make it blind to the other. This kind of failure was discussed in detail in the OpenBSC e-mail list in Spring 2009. It has not been discussed much in the OpenBTS list, but it is safe to assume that it happens if you try to run the BTS in the same band as your local GSM carriers or try to work with multi-band phones. If I am not mistaken, the OpenBSC people fixed this problem by realizing that the VCOCXO in the BS-11 is much better than the XO on their standard E1 card -- if they just leave it alone. In OpenBTS we avoid this problem by operating in non-local bands and disabling other bands in the handsets so that they never see any other network. (OpenBTS can do that because unlike OpenBSC our radio is mostly software and very flexible.)
The XO errors of the BTS and the handsets will vary with age and temperature, so the failure behavior will be different for every handset at any given time and will vary from hour to hour as things warm up and cool down. That can make it all seem very mysterious and can make diagnosis difficult.
When You Try to Run a Multi-BTS System
Normally, a handset receives a neighbor list from its serving BTS and constantly monitors the signal levels from the beacons of these neighbors. The key feature used for this monitoring is the extended training sequence (XTS) of the synchronization burst. This XTS has a duration of 64 symbols (roughly 0.25 ms). If we search for this XTS with a matched filter, the performance of that matched filter will degrade rapidly for frequency offsets greater than about 1/(4*0.25 ms), or about 1 kHz, because at higher frequencies the relative phases of the XTS and the matched filter will drift by more than 90 degrees over the correlation period. So neighbor monitoring (and therefore mobility itself) starts to fail for per-BTS carrier offsets of more than 500 Hz.
A 500 Hz error is 0.5 ppm in the low bands and 0.25 ppm in the high bands. While this is an easier requirement than the 0.05 ppm given in the spec, it is much tighter than could be provided by a simple XO. This accuracy can be provided by an OCXO or a good quality VCTCXO with regular calibration.
Solution to All Problems: Use a Good Clock
You can fix both of these clock problems completely replacing the XO clock in the USRP with something much more accurate, like a true OCXO or a very high-quality VCTCXO with an automated calibration procedure. Cheap options for a stable clock source are
- ClockTamer, which can be tunned to sub-50ppb accuracy manually, or locked to GPS for 50ppb stability.
- Several OpenBTS uses have also recommended the kit "FA-SY 1" from Funkamateur, available for about 40€, for desktop testing.
- Get a Range Networks dev kit or BTS unit. These products are specifically built to run OpenBTS and use a software-trimmable TCVCXO.
Third: When You Violate Clocking Relationships
Getting back to 05.10 Section 5.1, the GSM specifications dictate that the RF carrier and the symbol clock in the BTS be derived from a common source.
In the handset, the symbol clock is disciplined by the BTS RF carrier, just like the carrier clock. So, if your BTS RF carrier is in error by N ppm, the handset symbol clock will also be in error by N ppm. That's OK if the BTS symbol clock has exactly the same error, which it WILL if you derive everything from a common clock that way the specification tells you to.
But suppose you try to be clever. You know that once your equipment is warmed up, your BTS XO is consistently in error by +11 ppm, giving an RF carrier error of +10 kHz. So you deliberately de-tune your RF carrier by -10 kHz. That fixes the problems described in the previous section, but creates a new problem in the symbol rate clock. The standard GSM symbol clock is 270.833333 kHz. Your BTS XO is really off by +11 ppm, so your BTS symbol clock is really 270.836312 kHz. The handset locks to your BTS RF carrier, which is now spot-on its specified frequency, so the handset generates a correct internal symbol clock of 270.833333 kHz. So now the BTS and handset symbol clocks are slipping against each other at a rate of 11 ppm, or about 3 symbols per second.
Another variation on this problem is if you ignore the spec and use different clock devices for your RF carrier and your symbol clock. For example, suppose you have two 0.5 ppm OCXOs. Their relative fractional difference will be about 0.5 ppm, giving a drift of about one symbol every 7 seconds or so. (I've actually see this done before, too.)
Different handsets respond to the slipping symbol clock in different ways. In our experience, Nokia DCT3s just deal with it, apparently re-syncing on every frame, so if we only use DCT3s for testing, we may never realize that it is a serious problem. A Nokia DCT4 may camp to the beacon briefly, but will abort a transaction when the clock slips. In some handset designs, the symbol slipping interacts with the closed-loop timing advance, making that control loop unstable. Again, the error seems mysterious because different handset models respond in different ways and the effect will very with temperature as the XO in the BTS drifts around.