
The Science Behind Kinect

Dr. Daniel L. Lau

Abstract

The original Microsoft Kinect camera is having a profound impact on all areas of computer vision beyond just gaming, but with the introduction of the Kinect 2.0 sensor, there is a lot of confusion about exactly how the new device works. And how does the manner in which it works affect where and how it can operate effectively? In this article, I review the basic methodologies of the two profoundly different sensors and show that the new sensor isn't just a higher-resolution camera but a dramatically improved approach to 3-D sensing.

Summary

Machine vision is the area of research focused on measuring physical objects using a video camera. Sometimes confused with the broader area of computer vision, machine vision is typically associated with manufacturing and with detecting defects in parts on an assembly line, while computer vision is interested in any and all applications that involve teaching a computer about its physical surroundings via imaging sensors. In either case, 3-D imaging sensors such as Microsoft's Kinect camera are having a profound impact because, at an absurdly low price, they solve a host of problems associated with perspective distortion in traditional 2-D cameras. Perspective distortion refers to how things look smaller as they get farther away.

Figure 1: Illustration of the stereo-imaging method.

Of the many methods of measuring depth with a camera, the Kinect 1.0 sensor falls within a broad range of technologies that rely on triangulation. Triangulation is the process used by stereo-imaging systems, and it is how the human visual system (i.e. two eyes) works. The process is illustrated in Fig. 1, where I show two points in space at varying distances from the camera. Looking at the two images as viewed by the stereo-camera pair, the blue sphere, being closer to the cameras, has a greater disparity in position from camera A to camera B. That is, the blue sphere appears to shift by almost three-quarters of the cameras' field of view, while the red sphere shifts by only half that distance. This difference in apparent travel between the red and blue spheres is the phenomenon known as parallax, whereby closer objects produce greater degrees of parallax than distant ones.
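To put a number on that relationship: for a rectified stereo pair, the disparity d (in pixels) between a point's positions in the two images converts to distance through z = f·b/d, where f is the focal length in pixels and b is the baseline between the cameras. Here is a minimal C++ sketch of that conversion; the focal length and baseline values are illustrative assumptions of mine, not Kinect specifications.

#include <cstdio>

// Depth from disparity for a rectified stereo pair: a point seen at
// column xA in camera A and column xB in camera B has disparity
// d = xA - xB (in pixels), and by similar triangles its distance
// is z = f * b / d.
double depthFromDisparity(double focalPx, double baselineM, double disparityPx)
{
    if (disparityPx <= 0.0) return -1.0; // point at infinity or a mismatch
    return (focalPx * baselineM) / disparityPx;
}

int main()
{
    // Illustrative numbers only: f = 580 px, b = 7.5 cm.
    // The nearer (blue) sphere shows the larger disparity.
    printf("blue: %.2f m\n", depthFromDisparity(580.0, 0.075, 40.0)); // ~1.1 m
    printf("red : %.2f m\n", depthFromDisparity(580.0, 0.075, 20.0)); // ~2.2 m
    return 0;
}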

Of course, the Kinect 1.0 sensor doesn't have two cameras performing triangulation. Instead, it relies on triangulation between a near-infrared camera and a near-infrared laser source to perform a process called structured light. Structured light is one of the earliest methods of 3-D scanning, where a single light stripe mechanically sweeps across a target object, and from the large sequence of images taken as the stripe sweeps across the target, a complete 3-D surface can be reconstructed. Figure 2 shows how this works for a single frame of video, where the position of the laser appears to move left and right with depth such that the farther to the right it lands, the closer the target surface is to the camera. Note that with just the single laser stripe, a single frame of video can only reconstruct a single point of the target surface per row of the image sensor. So a 640×480 camera could reconstruct, at most, 480 unique points in space per frame. Hence, we need to sweep the laser so that we can generate many points in each row.
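To make the per-row reconstruction concrete, here is a small sketch of what one frame buys you: find the stripe's column in each row and map that column offset to depth. This is my own illustration, not Kinect firmware, and the linear column-to-depth mapping stands in for a proper calibrated triangulation model.

#include <vector>

// One depth sample per image row from a single laser stripe. 'image'
// is row-major grayscale, width x height. The stripe is taken to be
// the brightest column in each row; the mapping from column to depth
// is linearized here for brevity, with mmPerPixel and zeroColumn
// standing in for a measured calibration.
std::vector<double> stripeDepths(const unsigned char *image,
                                 int width, int height,
                                 double mmPerPixel, double zeroColumn)
{
    std::vector<double> depths(height, -1.0);
    for (int row = 0; row < height; row++) {
        const unsigned char *line = image + row * width;
        int peak = 0;
        for (int col = 1; col < width; col++)
            if (line[col] > line[peak]) peak = col;
        if (line[peak] > 32) // skip rows the stripe missed entirely
            depths[row] = (peak - zeroColumn) * mmPerPixel;
    }
    return depths;
}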

Figure 2: Illustration of the structured-light imaging method using a single laser stripe.

Obviously, for our sweeping laser stripe, we would need the target object to stay still during the scanning process, so this doesn't really work in real-time systems like the Kinect 1.0 camera, where we want to measure a moving subject. Instead, the Kinect 1.0 sensor makes use of a pseudo-random dot pattern, produced by the near-infrared laser source, that illuminates the entire field of view at once, as illustrated in Fig. 3. Having this image, the processing hardware inside the camera then looks at small windows of the captured image and attempts to find the matching dot pattern in the known projected pattern.
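As a rough illustration of that matching step (my sketch of generic window matching, not the actual algorithm in the Kinect's silicon), the search slides a small window of the captured image along the corresponding row of the stored reference pattern and keeps the offset with the lowest difference:

#include <cstdlib>
#include <climits>

// Find the horizontal offset (disparity) at which a small window of
// the captured IR image best matches the stored reference dot pattern,
// scored by sum of absolute differences. Both images are row-major,
// 'width' pixels wide, and assumed epipolar-rectified so the search is
// one-dimensional. The caller must keep (x, y), 'window', and
// 'maxDisparity' away from the image borders.
int bestDisparity(const unsigned char *captured, const unsigned char *reference,
                  int width, int x, int y, int window, int maxDisparity)
{
    int best = 0;
    long bestScore = LONG_MAX;
    for (int d = 0; d <= maxDisparity; d++) {
        long score = 0;
        for (int dy = -window; dy <= window; dy++)
            for (int dx = -window; dx <= window; dx++)
                score += labs((long)captured[(y + dy) * width + (x + dx)]
                            - (long)reference[(y + dy) * width + (x + dx - d)]);
        if (score < bestScore) { bestScore = score; best = d; }
    }
    return best; // larger disparity means a closer surface
}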

Figure 3: Captured IR image from Microsoft Kinect 1.0 Sensor.

Of course, you're probably wondering why the perspective distortion phenomenon doesn't make the dots look smaller and closer together as the reflecting surface gets farther from the camera. It doesn't because the camera and the laser projector are epipolar rectified. That is, the camera and projector have matching fields of view: as the reflecting surface gets farther from the sensor, the laser pattern gets larger and larger, since it is a cone of light that expands with distance from its source, and this cone expands at the same rate as the lines of sight of the camera's pixels spread apart. What this means in the captured image is that dots in the projected pattern appear to move left and right with distance, not up and down, while keeping their apparent size and spacing. So by simply tracking a dot's horizontal coordinate, the Kinect can tell you how far that dot is from the camera sensor, i.e. that pixel's observed depth.

The problem with the Kinect 1.0 sensor lies in the manner in which it relies on small windows of the captured image. That is, it needs to detect individual dots, and then it needs to find neighboring dots as well; from a constellation of these dots, it can identify the matching constellation of points in the projected dot pattern. Without such a constellation of points, there is no way for the Kinect processor to uniquely identify a dot in the projected pattern. We call this an ambiguity, and it means I cannot derive a depth estimate. For computer gaming, this isn't really a problem, because body parts are large enough that my constellations fit inside the pixels forming your arm. For measuring thin objects like hair or, perhaps, a utility cord as thin as a single image pixel, this is a significant obstacle, and it's the major roadblock that the Kinect 2.0 attempts to address.

The Microsoft Kinect 2.0 sensor relies upon a novel image sensor that indirectly measures the time it takes for pulses of laser light to travel from a laser projector to a target surface and back to the image sensor. How is this possible? Well, quite easily, if you consider that in the time it takes for one complete clock cycle of a 1 GHz processor, a pulse of light travels about 1 foot. That means that if I can build a stopwatch that runs at 10 GHz, I can measure the round-trip travel distance of a pulse of light to within 0.1 feet. If I pulse the laser and make many measurements over a short period of time, I can increase my precision to the point where LIDAR systems are available that can measure distances to within less than a centimeter at ranges over one kilometer.
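The arithmetic is worth seeing once. Light covers c·t in time t, and the measured path is out and back, so the range to the target is c·t/2. A quick C++ check of the numbers quoted above (the 40-tick example at the end is my own illustration):

#include <cstdio>

int main()
{
    const double c = 2.998e8;           // speed of light, m/s
    const double tick = 1.0 / 10.0e9;   // one period of a 10 GHz clock, s

    // Light covers c * tick metres per clock tick; the measured
    // path is out-and-back, so range resolution is half of that.
    double perTick = c * tick;          // ~0.030 m, i.e. ~0.10 ft of travel
    printf("travel per tick : %.3f m (%.2f ft)\n", perTick, perTick / 0.3048);
    printf("range resolution: %.3f m\n", perTick / 2.0);

    // Example: a pulse that returns after 40 ticks (4 ns) has made a
    // round trip of ~1.2 m, placing the target ~0.6 m away.
    double t = 40 * tick;
    printf("target range    : %.2f m\n", c * t / 2.0);
    return 0;
}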

What the Kinect 2.0 sensor does is take each pixel and divide it in half. One half of the pixel is turned on and off really fast such that, when it is on, it is absorbing photons of laser light, and when it is off, it rejects those photons. The other half of the pixel is doing the same thing; however, it's doing it 180 degrees out of phase from the first half, so that when the first half is on, it's off, and when the first half is off, it's on. At the same time this is happening, the laser light source is being pulsed in phase with the first pixel half: if the first half is on, so is the laser, and if the first half is off, the laser is too.

As illustrated by the timing diagram of Fig. 4, suppose we aim the laser source directly at the camera from very close proximity; then the time it takes for the laser light to leave the laser and land on the sensor is essentially 0 seconds. This is depicted in Fig. 4 by the top row of light pulses (red boxes) being in perfect alignment with the gray columns. As such, the laser light will be absorbed by the first half of every camera pixel, since these halves are turned on, and rejected by the second halves, since those are turned off. Now suppose I move the laser source back one foot. Then the laser light will arrive one 1 GHz clock cycle (about a nanosecond) after leaving the source, as depicted by the second row of laser pulses in Fig. 4. So light photons that left the laser just as it was turned on will arrive while the first halves of the camera pixels are still turned on, meaning that they will be absorbed by the first halves and rejected by the second.

Figure 4: Illustration of the indirect time-of-flight method.

Photons leaving the laser just as it is turned off will then arrive just after the camera pixels' second halves are turned on, meaning they will be rejected by the first halves and absorbed by the second. That means the total amount of light absorbed by the first halves will decrease slightly while that of the second halves will increase slightly. As we move the laser source even farther from the camera sensor, more and more of the photons emitted by the laser will arrive at the sensor while the second halves are turned on, so the second-half recordings will grow larger and larger while the first-half recordings grow smaller and smaller. After several milliseconds of exposure, the two total amounts of photons recorded by the two halves are compared: the more photons recorded by the second halves relative to the first, the longer the round-trip distance the light must have traveled. Now, it's important to have two halves record the incoming laser light because the target surface may absorb some of it. If it does, then the total number of photons reflected back will not equal the total number projected, but this loss affects both pixel halves equally. So it is the ratio of photons recorded by the two halves that encodes depth, not the total number recorded by either side.
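Here is a minimal sketch of that ratio-to-distance conversion for an idealized two-tap pixel with laser pulse width T. The names and numbers are mine, and a real sensor must calibrate away offsets and nonlinearities that this ignores.

#include <cstdio>

// Idealized two-tap pulsed time-of-flight conversion. The laser and
// tap A are on for the first T seconds of each cycle, tap B for the
// next T seconds. A return delayed by tau (0 <= tau <= T) splits its
// photons so that countA ~ (T - tau) and countB ~ tau; the ratio
// recovers tau regardless of how reflective the surface is.
double rangeFromTaps(double countA, double countB,
                     double pulseWidthS /* T */)
{
    const double c = 2.998e8;                      // m/s
    double total = countA + countB;
    if (total <= 0.0) return -1.0;                 // no return detected
    double tau = pulseWidthS * (countB / total);   // round-trip delay
    return c * tau / 2.0;                          // one-way range
}

int main()
{
    // A dark surface returns fewer photons overall but in the same
    // ratio, so both reads give the same range (~1.5 m here).
    printf("%.2f m\n", rangeFromTaps(9000.0, 1000.0, 100e-9));
    printf("%.2f m\n", rangeFromTaps( 900.0,  100.0, 100e-9));
    return 0;
}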

At some point, though, the travel distance of the laser light may be so long that the laser photons arrive so late at the sensor that they overshoot the pixels' second halves' on-window and land in the first halves' next on-window, as depicted by the third row of laser pulses in Fig. 4. This results in an ambiguity, which is resolved by increasing the time that the pixel halves are turned on, giving the light more time to travel round trip and still land inside the second halves' on-window. Of course, lengthening the window also makes it harder to detect small changes in travel distance, since all sensors have some amount of thermal noise (i.e. free electrons floating through the semiconductor lattice) that looks like light photons, as well as limited precision. So what the Kinect 2.0 does is take two measurements: the first is a low-resolution estimate with no ambiguities in distance, and the second is taken with high precision, using the first estimate to eliminate any ambiguities. Depending on how fast the sensor works, we can always take additional estimates with greater and greater degrees of precision.
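A sketch of that coarse-plus-fine scheme, as I understand the general unwrapping idea rather than Microsoft's exact pipeline: the precise measurement repeats every so many meters, and the coarse estimate selects which repetition is the right one.

#include <cmath>
#include <cstdio>

// Resolve a precise-but-wrapped range estimate using a coarse,
// unambiguous one. 'fine' repeats every 'interval' metres; we add
// whichever whole number of intervals lands closest to 'coarse'.
double unwrapRange(double coarse, double fine, double interval)
{
    double wraps = std::round((coarse - fine) / interval);
    return fine + wraps * interval;
}

int main()
{
    // Coarse says ~4.4 m (noisy); fine says 0.72 m but wraps every
    // 1.87 m. The unwrapped answer is 0.72 + 2 * 1.87 = 4.46 m.
    printf("%.2f m\n", unwrapRange(4.4, 0.72, 1.87));
    return 0;
}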

Now, while all of this time-of-flight business sounds really cool, the Kinect 2.0 is even cooler because the sensor also has built-in ambient-light rejection: each pixel individually detects when it is oversaturated with incoming ambient light and resets itself in the middle of an exposure. The Kinect 1.0 sensor has no means of rejecting ambient light and, as such, cannot be used in environments prone to near-infrared light sources (e.g. sunlight). In fact, the Kinect 2.0 sensor's light rejection is one of the reasons why its original developers considered using the system in automotive applications for things like a rear-view camera.

For gaming, this process of indirectly measuring the time of flight of a laser pulse allows each pixel to independently measure distance, whereas the Kinect 1.0 has to measure distance using neighborhoods of pixels; it cannot measure distances in the spaces between laser spots. This has an impact on depth resolution, where the Kinect 1.0 has been cited as having a depth-resolution limit of around 1 centimeter. The Kinect 2.0 is limited instead by the speed at which it can pulse its laser source, with shorter pulses offering higher degrees of depth precision, and it can pulse that laser at really short intervals. What the Kinect 1.0 has that the Kinect 2.0 doesn't is that it can rely on off-the-shelf camera sensors and can run at any frame rate, whereas the Kinect 2.0 uses a very unique sensor that is very expensive to manufacture. Only a large corporation like Microsoft, and only a high-volume market like gaming, could achieve the economies of scale needed to bring this sensor into the home at such an affordable price. Considering the technology involved, one might say an absurdly low price.

Conclusions

Of course, over the next couple of months the differences between the two sensors are going to become apparent as researchers such as myself get our hands on the new sensor and have the opportunity to see when and where the two systems are most appropriate. At present, I'm hard at work developing machine vision systems for precision dairy farming using the Kinect 1.0 sensor, but I'm doing so knowing that I'll need the precision of the new sensor.

 

#define MAXPACKETSIZE 512

#include <QWidget>
#include <QVBoxLayout>
#include <QHBoxLayout>
#include <QLineEdit>
#include <QComboBox>
#include <QSpinBox>
#include <QNetworkInterface>
#include <QHostAddress>
#include <QList>
#include <QLabel>
#include <QUdpSocket>

class QTFTPWidget : public QWidget
{
    Q_OBJECT

public:
    explicit QTFTPWidget(QString localAddressString = QString(), int localPort = -1,
                           QString remoteAddressString = QString(), int remotePort = -1,
                           QWidget *parent = 0);
    ~QTFTPWidget();

    bool putByteArray(QString filename, QByteArray transmittingFile);
    bool getByteArray(QString filename, QByteArray *requestedFile);

    inline QString errorString() { return(lastErrorString); }

    QString remoteAddressString();
    QString localAddressString();
    int remotePortNumber();
    int localPortNumber();

public slots:
    inline void onSetLocalAddress(QString address) { localAddressSuppliedByUser=address; }
    inline void onSetLocalPort(int port) { localPortSuppliedByUser=port; }
    inline void onSetRemoteAddress(QString address) { remoteAddressSuppliedByUser=address; }
    inline void onSetRemotePort(int port) { remotePortSuppliedByUser=port; }

private:
    QString localAddressSuppliedByUser, remoteAddressSuppliedByUser;
    int localPortSuppliedByUser, remotePortSuppliedByUser;

    QString lastErrorString;
    QUdpSocket *socket;
    QComboBox *localAddressComboBox;
    QSpinBox *localPortSpinBox;
    QLineEdit *remoteAddressLineEdit;
    QSpinBox *remotePortSpinBox;

    bool bindSocket();
    QByteArray getFilePacket(QString filename);
    QByteArray putFilePacket(QString filename);
};

QTFTPWidget::QTFTPWidget(QString localAddressString, int localPort, QString remoteAddressString, int remotePort, QWidget *parent) : QWidget(parent),
                             localAddressSuppliedByUser(localAddressString), localPortSuppliedByUser(localPort),
                             remoteAddressSuppliedByUser(remoteAddressString), remotePortSuppliedByUser(remotePort)
{
    this->setWindowTitle(QString("TFTP Widget"));
    this->setLayout(new QVBoxLayout());
    this->layout()->setSpacing(1);

    QWidget *widget=new QWidget();
    widget->setLayout(new QHBoxLayout());
    widget->layout()->setContentsMargins(0, 0, 0, 0);

    localAddressComboBox = new QComboBox();
    localAddressComboBox->setSizePolicy(QSizePolicy::Expanding, QSizePolicy::Fixed);
    if (localAddressSuppliedByUser.isEmpty()){
        QList<QHostAddress> hstList = QNetworkInterface::allAddresses();
        for (int n=0; n<hstList.count(); n++){
            if (hstList.at(n).toIPv4Address()){
                QString string=hstList.at(n).toString();
                localAddressComboBox->addItem(string);
            }
        }
    } else {
        localAddressComboBox->addItem(localAddressSuppliedByUser);
        localAddressComboBox->setDisabled(true);
    }

    localPortSpinBox = new QSpinBox();
    localPortSpinBox->setFixedWidth(100);
    localPortSpinBox->setMaximum(65535);
    if (localPortSuppliedByUser > 0){
        localPortSpinBox->setMinimum(0);
        localPortSpinBox->setValue(localPortSuppliedByUser);
        localPortSpinBox->setDisabled(true);
    } else {
        localPortSpinBox->setMinimum(1024);
        localPortSpinBox->setValue(7755);
    }

    QLabel *label=new QLabel(QString("Local Address:"));
    label->setFixedWidth(120);

    widget->layout()->addWidget(label);
    widget->layout()->addWidget(localAddressComboBox);
    widget->layout()->addWidget(localPortSpinBox);
    this->layout()->addWidget(widget);

    widget=new QWidget();
    widget->setLayout(new QHBoxLayout());
    widget->layout()->setContentsMargins(0, 0, 0, 0);

    remoteAddressLineEdit = new QLineEdit(remoteAddressSuppliedByUser);
    remoteAddressLineEdit->setSizePolicy(QSizePolicy::Expanding, QSizePolicy::Fixed);
    if (!remoteAddressSuppliedByUser.isEmpty()){
        remoteAddressLineEdit->setReadOnly(true);
    }

    remotePortSpinBox = new QSpinBox();
    remotePortSpinBox->setFixedWidth(100);
    remotePortSpinBox->setMaximum(65535);
    if (remotePortSuppliedByUser > 0){
        remotePortSpinBox->setMinimum(0);
        remotePortSpinBox->setValue(remotePortSuppliedByUser);
        remotePortSpinBox->setDisabled(true);
    } else {
        remotePortSpinBox->setMinimum(1024);
        remotePortSpinBox->setValue(7755);
    }

    label=new QLabel(QString("Remote Address:"));
    label->setFixedWidth(120);

    widget->layout()->addWidget(label);
    widget->layout()->addWidget(remoteAddressLineEdit);
    widget->layout()->addWidget(remotePortSpinBox);
    this->layout()->addWidget(widget);

    widget=new QWidget();
    widget->setLayout(new QHBoxLayout());
    widget->layout()->setContentsMargins(0, 0, 0, 0);

    this->layout()->addWidget(widget);

    socket=NULL;
}

QTFTPWidget::~QTFTPWidget()
{
    if (socket) delete socket;
}

bool QTFTPWidget::bindSocket()
{
    // SEE IF WE ALREADY HAVE A BOUND SOCKET
    // AND IF SO, WE NEED TO DELETE IT
    if (socket != NULL) {
        delete socket;
    }

    // CREATE A NEW SOCKET
    socket=new QUdpSocket();

    // AND SEE IF WE CAN BIND IT TO A LOCAL IP ADDRESS AND PORT
    if (localAddressSuppliedByUser.isEmpty()){
        return(socket->bind(QHostAddress(localAddressComboBox->currentText()), localPortSpinBox->value()));
    } else {
        return(socket->bind(QHostAddress(localAddressSuppliedByUser), localPortSuppliedByUser));
    }
}

QString QTFTPWidget::remoteAddressString()
{
    return(remoteAddressLineEdit->text());
}

QString QTFTPWidget::localAddressString()
{
    return(localAddressComboBox->currentText());
}

int QTFTPWidget::remotePortNumber()
{
    return(remotePortSpinBox->value());
}

int QTFTPWidget::localPortNumber()
{
    return(localPortSpinBox->value());
}

bool QTFTPWidget::putByteArray(QString filename, QByteArray transmittingFile)
{
    // BIND OUR LOCAL SOCKET TO AN IP ADDRESS AND PORT
    if (!bindSocket()) {
        lastErrorString = socket->errorString();
        return(false);
    }

    // MAKE A LOCAL COPY OF THE REMOTE HOST ADDRESS AND PORT NUMBER
    QHostAddress hostAddress(QHostAddress(remoteAddressLineEdit->text()));
    int portNumber = remotePortSpinBox->value();

    // CREATE REQUEST PACKET AND SEND TO HOST
    // WAIT UNTIL MESSAGE HAS BEEN SENT, QUIT IF TIMEOUT IS REACHED
    QByteArray reqPacket=putFilePacket(filename);
    if (socket->writeDatagram(reqPacket, hostAddress, portNumber) != reqPacket.length()){
        lastErrorString = QString("did not send packet to host :( %1").arg(socket->errorString());
        return(false);
    }

    // CREATE PACKET COUNTERS TO KEEP TRACK OF MESSAGES: THE SERVER
    // ACKNOWLEDGES OUR WRITE REQUEST WITH BLOCK 0, AND TFTP NUMBERS
    // DATA BLOCKS STARTING FROM 1
    unsigned short incomingPacketNumber=0;
    unsigned short outgoingPacketNumber=1;

    // NOW WAIT HERE FOR INCOMING DATA
    bool messageCompleteFlag=false;
    while (1){
        // WAIT FOR AN INCOMING PACKET
        if (socket->hasPendingDatagrams() || socket->waitForReadyRead(10000)){
            // READ THE WAITING DATAGRAM (THERE IS AT LEAST A
            // PACKET HEADER'S WORTH OF DATA TO READ)
            QByteArray incomingDatagram;
            incomingDatagram.resize(socket->pendingDatagramSize());
            socket->readDatagram(incomingDatagram.data(), incomingDatagram.length());

            // MAKE SURE FIRST BYTE IS 0
            char *buffer=incomingDatagram.data();
            char zeroByte=buffer[0];
            if (zeroByte != 0x00) {
                lastErrorString = QString("Incoming packet has invalid first byte (%1).").arg((int)zeroByte);
                return(false);
            }

            // READ THE UNSIGNED SHORT BLOCK NUMBER; TFTP SENDS IT
            // BIG-ENDIAN ON THE WIRE, SO SWAP THE TWO BYTES INTO THE
            // HOST'S UNSIGNED SHORT (THIS ASSUMES A LITTLE-ENDIAN HOST)
            unsigned short incomingMessageCounter;
            *((char*)&incomingMessageCounter+1)=buffer[2];
            *((char*)&incomingMessageCounter+0)=buffer[3];

            // CHECK INCOMING MESSAGE ID NUMBER AND MAKE SURE IT MATCHES
            // WHAT WE ARE EXPECTING, OTHERWISE WE'VE LOST OR GAINED A PACKET
            if (incomingMessageCounter == incomingPacketNumber){
                incomingPacketNumber++;
            } else {
                lastErrorString = QString("error on incoming packet number %1 vs expected %2").arg(incomingMessageCounter).arg(incomingPacketNumber);
                return(false);
            }

            // CHECK THE OPCODE FOR ANY ERROR CONDITIONS
            char opCode=buffer[1];
            if (opCode != 0x04) { /* AN ACK PACKET SHOULD HAVE OPCODE 4 AND ECHO THE BLOCK NUMBER WE JUST SENT */
                lastErrorString = QString("Incoming packet returned invalid operation code (%1).").arg((int)opCode);
                return(false);
            } else {
                // SEE IF WE NEED TO SEND ANYMORE DATA PACKETS BY CHECKING END OF MESSAGE FLAG
                if (messageCompleteFlag) break;

                // SEND NEXT DATA PACKET TO HOST
                QByteArray transmitByteArray;
                transmitByteArray.append((char)0x00);
                transmitByteArray.append((char)0x03); // send data opcode
                transmitByteArray.append(*((char*)&outgoingPacketNumber+1));
                transmitByteArray.append(*((char*)&outgoingPacketNumber));

                // APPEND DATA THAT WE WANT TO SEND
                int numBytesAlreadySent=(outgoingPacketNumber-1)*MAXPACKETSIZE;
                int bytesLeftToSend=transmittingFile.length()-numBytesAlreadySent;
                if (bytesLeftToSend < MAXPACKETSIZE){
                    messageCompleteFlag=true;
                    if (bytesLeftToSend > 0){
                        transmitByteArray.append((transmittingFile.data()+numBytesAlreadySent), bytesLeftToSend);
                    }
                } else {
                    transmitByteArray.append((transmittingFile.data()+numBytesAlreadySent), MAXPACKETSIZE);
                }

                // SEND THE PACKET AND MAKE SURE IT GETS SENT
                if (socket->writeDatagram(transmitByteArray, hostAddress, portNumber) != transmitByteArray.length()){
                    lastErrorString = QString("did not send data packet to host :( %1").arg(socket->errorString());
                    return(false);
                }

                // NOW THAT WE'VE SENT A DATA PACKET, INCREMENT THE BLOCK COUNTER
                outgoingPacketNumber++;
            }
        } else {
            lastErrorString = QString("No message received from host :( %1").arg(socket->errorString());
            return(false);
        }
    }
    lastErrorString = QString("no error");
    return(true);

}

bool QTFTPWidget::getByteArray(QString filename, QByteArray *requestedFile)
{
    // BIND OUR LOCAL SOCKET TO AN IP ADDRESS AND PORT
    if (!bindSocket()) {
        lastErrorString = socket->errorString();
        return(false);
    }

    // MAKE A LOCAL COPY OF THE REMOTE HOST ADDRESS AND PORT NUMBER
    QHostAddress hostAddress(QHostAddress(remoteAddressLineEdit->text()));
    int portNumber = remotePortSpinBox->value();

    // CLEAN OUT ANY INCOMING PACKETS
    while (socket->hasPendingDatagrams()){
        QByteArray byteArray;
        byteArray.resize(socket->pendingDatagramSize());
        socket->readDatagram(byteArray.data(), byteArray.length());
    }

    // CREATE REQUEST PACKET AND SEND TO HOST
    // WAIT UNTIL MESSAGE HAS BEEN SENT, QUIT IF TIMEOUT IS REACHED
    QByteArray reqPacket=getFilePacket(filename);
    if (socket->writeDatagram(reqPacket, hostAddress, portNumber) != reqPacket.length()){
        lastErrorString =  QString("did not send packet to host :( %1").arg(socket->errorString());
        return(false);
    }

    // CREATE PACKET COUNTERS TO KEEP TRACK OF MESSAGES
    // (TFTP NUMBERS THE DATA BLOCKS OF A READ STARTING FROM 1)
    unsigned short incomingPacketNumber=1;
    unsigned short outgoingPacketNumber=1;

    // NOW WAIT HERE FOR INCOMING DATA
    bool messageCompleteFlag=false;
    while (!messageCompleteFlag){
        // WAIT FOR AN INCOMING PACKET
        if (socket->hasPendingDatagrams() || socket->waitForReadyRead(10000)){
            // READ THE WAITING DATAGRAM (THERE IS AT LEAST A
            // PACKET HEADER'S WORTH OF DATA TO READ)
            QByteArray incomingDatagram;
            incomingDatagram.resize(socket->pendingDatagramSize());
            socket->readDatagram(incomingDatagram.data(), incomingDatagram.length());

            // MAKE SURE FIRST BYTE IS 0
            char *buffer=incomingDatagram.data();
            char zeroByte=buffer[0];
            if (zeroByte != 0x00) {
                lastErrorString = QString("Incoming packet has invalid first byte (%1).").arg((int)zeroByte);
                return(false);
            }

            // READ THE UNSIGNED SHORT BLOCK NUMBER; TFTP SENDS IT
            // BIG-ENDIAN ON THE WIRE, SO SWAP THE TWO BYTES INTO THE
            // HOST'S UNSIGNED SHORT (THIS ASSUMES A LITTLE-ENDIAN HOST)
            unsigned short incomingMessageCounter;
            *((char*)&incomingMessageCounter+1)=buffer[2];
            *((char*)&incomingMessageCounter+0)=buffer[3];

            // CHECK INCOMING MESSAGE ID NUMBER AND MAKE SURE IT MATCHES
            // WHAT WE ARE EXPECTING, OTHERWISE WE'VE LOST OR GAINED A PACKET
            if (incomingMessageCounter == incomingPacketNumber){
                incomingPacketNumber++;
            } else {
                lastErrorString = QString("error on incoming packet number %1 vs expected %2").arg(incomingMessageCounter).arg(incomingPacketNumber);
                return(false);
            }

            // COPY THE INCOMING FILE DATA
            QByteArray incomingByteArray(&buffer[4], incomingDatagram.length()-4);

            // SEE IF WE RECEIVED A COMPLETE 512 BYTES AND IF SO,
            // THEN THERE IS MORE INFORMATION ON THE WAY
            // OTHERWISE, WE'VE REACHED THE END OF THE RECEIVING FILE
            if (incomingByteArray.length() < MAXPACKETSIZE){
                messageCompleteFlag=true;
            }

            // APPEND THE INCOMING DATA TO OUR COMPLETE FILE
            requestedFile->append(incomingByteArray);

            // CHECK THE OPCODE FOR ANY ERROR CONDITIONS
            char opCode=buffer[1];
            if (opCode != 0x03) { /* A DATA PACKET SHOULD HAVE OPCODE 3; ANYTHING ELSE IS AN ERROR */
                lastErrorString = QString("Incoming packet returned invalid operation code (%1).").arg((int)opCode);
                return(false);
            } else {
                // SEND PACKET ACKNOWLEDGEMENT BACK TO HOST REFLECTING THE INCOMING PACKET NUMBER
                QByteArray ackByteArray;
                ackByteArray.append((char)0x00);
                ackByteArray.append((char)0x04);
                ackByteArray.append(*((char*)&incomingMessageCounter+1));
                ackByteArray.append(*((char*)&incomingMessageCounter));

                // SEND THE PACKET AND MAKE SURE IT GETS SENT
                if (socket->writeDatagram(ackByteArray, hostAddress, portNumber) != ackByteArray.length()){
                    lastErrorString = QString("did not send ack packet to host :( %1").arg(socket->errorString());
                    return(false);
                }

                // NOW THAT WE'VE SENT AN ACK SIGNAL, INCREMENT SENT MESSAGE COUNTER
                outgoingPacketNumber++;
            }
        } else {
            lastErrorString = QString("No message received from host :( %1").arg(socket->errorString());
            return(false);
        }
    }
    return(true);
}

QByteArray QTFTPWidget::getFilePacket(QString filename)
{
    // BUILD A READ REQUEST (RRQ) PACKET: OPCODE 1, THEN THE FILENAME
    // AND TRANSFER MODE, EACH TERMINATED BY A ZERO BYTE
    QByteArray byteArray(filename.toLatin1());
    byteArray.prepend((char)0x01); // OPCODE
    byteArray.prepend((char)0x00);
    byteArray.append((char)0x00);
    byteArray.append(QString("octet").toLatin1()); // MODE
    byteArray.append((char)0x00);

    return(byteArray);
}

QByteArray QTFTPWidget::putFilePacket(QString filename)
{
    // BUILD A WRITE REQUEST (WRQ) PACKET: OPCODE 2, THEN THE FILENAME
    // AND TRANSFER MODE, EACH TERMINATED BY A ZERO BYTE
    QByteArray byteArray;
    byteArray.append((char)0x00);
    byteArray.append((char)0x02); // OPCODE
    byteArray.append(filename.toLatin1());
    byteArray.append((char)0x00);
    byteArray.append(QString("octet").toLatin1()); // MODE
    byteArray.append((char)0x00);

    return(byteArray);
}