page created : August 20th, 2008      last update : January 11th, 2009 (link to Nic_2)  authors : Claude Baumann/Laurent Kneip

Robot name: Nic_3

Abstract: This LEGO™ Mindstorms® robot Nic_3 uses an NXT and an RCX brick that are linked through the HiTechnic IR-link module. The 3-DOF robot samples 600 data points of a continuous sound source on each channel of the stereo audio sensor. The sampling frequency is 36 kHz. It then performs the cross-correlation and determines the time lag between both signals. Applying a novel algorithm that is based on time difference of arrival measurements and head movements (Kneip/Baumann 2008), the robot is able to localize a sound source in space with an accuracy of 20° in both azimuth and elevation. (Notice that this is worse than the accuracy of the robot Nic_2, which served in the experimental part of the JASA article. However, Nic_2 is more difficult to manipulate.)
Photo:

Previous work:
2D sound localizing based on interaural level differences
Sensor based on the precedence effect
3D precedence effect sensor
Detail of a robot using the previous sensor
binaural sensor
2D sound localizing based on interaural time differences (Nic_1)
Article: C. BAUMANN, L. KNEIP, Stereo robot ears, Elektor, July/August 2007, p. 13-17 (presents a digital ITD sensor)
L. KNEIP, C. BAUMANN, Binaural model for spatial sound localization based on interaural time delay cues ..., unpublished work that comprises the 3 DOFs LEGO™ RCX based robot Nic_2, Feb. 2007 (Notice: With 10° Nic_2 has better accuracy than Nic_3, but it is more difficult to manipulate.)
Article: L. KNEIP, C. BAUMANN, Binaural model for artificial spatial sound localization based on interaural time delays and movements of the interaural axis, The Journal of the Acoustical Society of America -- November 2008 -- Volume 124, Issue 5, pp. 3108-3119 (link)
Phase detection
Bill of material:
LEGO™ NXT: motor control, connection to LEGO™ RCX
LEGO™ RCX: 2 channel ADC (36kHz), cross-correlation, Kneip/Baumann algorithm
HiTechnic IR-link (invisible on the photo)
3 LEGO™ NXT servo-motors
RCX compatible 2-channel audio amplifier
RCX compatible laser pointer
About 100 standard LEGO™ pieces
Function summary:
    • A small music playing mono-radio, representing an immobile audio source, is emitting sound waves that arrive at the microphones with different delays depending on its position in the robot head referential. 
    • The stereo amplifier enhances the signals in order to make them measurable by the RCX 10bit analog-to-digital converters (ADCs).
    • The RCX is programmed to sample 600 data values on each channel. (Notice: UR has a high-speed sampling feature. If two ports are read, the sampling frequency is 36 kHz.)
    • The RCX cross-correlates both signals and yields the time difference of arrival (TDOA, also called interaural time difference, ITD). The program uses a zero-padding variant of the sum of absolute differences (SAD) method in the time domain, which is normally used for template matching in image processing. (Notice: It can easily be shown that the SAD function has many similarities with the cross-correlation function. For instance, the extrema have the same arguments and thus the phase shifts are identical.)
    • The robot rotates the head about the y-axis and repeats the TDOA measurement. Applying the Kneip/Baumann 2008 algorithm, which essentially is a mathematically exact determination of the spatial direction of sound, the robot calculates the azimuth and the elevation of the sound vector.
    • The NXT continuously polls the RCX variables that contain the information of the absolute angular target position (containers Red, Blue and Yellow). For this purpose the NXT uses the HiTechnic IR-link. (Notice: The IR-link can only poll RCX sensor-values by sending the RCX opcode 0x12; RCX variables cannot be read with the normal RCX firmware. But, since the RCX is programmed using the ULTIMATE ROBOLAB® environment, the opcode handler has been changed to respond to the polling by sending the listed variable values.)
    • From the measured/computed TDOAs the RCX deduces the azimuth and the elevation using the novel Kneip/Baumann method.
    • Now the robot rotates the head in order to have the laser point into the direction of sound. The laser pointer is powered from the RCX.
    • Program flow-chart:

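The time-domain lag search described in the function summary can be illustrated with a short Python sketch. It is not the RCX firmware: the names sad_lag and lag_to_tdoa, the max_lag parameter, and the choice to average over the overlapping samples (instead of the zero-padding variant used on the robot) are assumptions made for this example.

```python
def sad_lag(left, right, max_lag):
    """Return the lag (in samples) with the smallest mean absolute difference.

    A positive result means `right` is a delayed copy of `left`.
    Assumption: scores are averaged over the overlapping samples only,
    which differs from the zero-padding variant mentioned in the text.
    """
    n = min(len(left), len(right))
    best_lag, best_score = 0, float("inf")
    for lag in range(-max_lag, max_lag + 1):
        total, count = 0, 0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # skip samples that fall outside the buffer
                total += abs(left[i] - right[j])
                count += 1
        score = total / count
        if score < best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_tdoa(lag, fs_hz=36_000):
    """Convert a lag in samples to a TDOA in seconds (36 kHz as on this page)."""
    return lag / fs_hz
```

With two 600-sample buffers, sad_lag(left, right, 20) searches lags of -20 to +20 samples. At 36 kHz a lag of 5 samples corresponds to a TDOA of about 139 µs.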
Video:

  • Note 1: The video is sometimes unavailable on YouTube. Please try the high resolution version at http://www.youtube.com/watch?v=SfwjKsqwPk8
  • Note 2: The Kneip/Baumann 2008 method can also be applied to rotations about the z-axis with the same result.
  • Note 3: Nic_3 needs plenty of time for each cue. With enhanced electronics based on a DSP module for instance, a comparable robot could localize sound in real-time while moving around.
  • Note 4: There are serious limits to the accuracy because of the mechanical limitations of the LEGO material. While conceiving Nic_2, the predecessor of the described robot, Laurent Kneip had the excellent idea of applying some torsion to each axis, by twisting the axles with a certain torque with the result that the backlash in the gearings is reduced. However this astute technique has not been applied to Nic_3.
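To illustrate why a second TDOA measurement after a head rotation pins down a direction in space, here is a generic far-field sketch in Python. It is not the published Kneip/Baumann algorithm. It only assumes that each TDOA constrains the source direction to a cone around the current interaural axis, and intersects two such cones. The function names, the microphone spacing parameter mic_distance_m, and the speed of sound of 343 m/s are hypothetical choices for the example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def tdoa_to_cosine(tdoa_s, mic_distance_m, speed_of_sound=343.0):
    """Far-field model: cos(angle between source and interaural axis)."""
    c = speed_of_sound * tdoa_s / mic_distance_m
    return max(-1.0, min(1.0, c))  # clamp against measurement noise


def direction_candidates(a1, c1, a2, c2):
    """Intersect two cones s.a1 = c1 and s.a2 = c2 with |s| = 1.

    a1, a2 are unit vectors along the interaural axis before and after
    the head rotation. Returns the two mirror-image solutions.
    """
    g = dot(a1, a2)
    denom = 1.0 - g * g
    if denom < 1e-12:
        raise ValueError("axes are parallel; the rotation adds no information")
    alpha = (c1 - g * c2) / denom
    beta = (c2 - g * c1) / denom
    # Squared length left over for the component normal to both axes.
    rem = 1.0 - (alpha * alpha + beta * beta + 2.0 * alpha * beta * g)
    rem = max(rem, 0.0)  # noisy cues may not intersect; result is then not unit length
    gamma = math.sqrt(rem / denom)
    n = cross(a1, a2)
    base = [alpha * x + beta * y for x, y in zip(a1, a2)]
    plus = tuple(b + gamma * k for b, k in zip(base, n))
    minus = tuple(b - gamma * k for b, k in zip(base, n))
    return plus, minus
```

The two candidates are mirror images with respect to the plane spanned by both axis orientations. The sketch returns both and leaves the choice between them open.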
Software:
The RCX is programmed using the ULTIMATE ROBOLAB® (UR) environment. Each UR program is a complete firmware for the RCX. Therefore it is possible to program the device graphically at the lowest level. This unlocks unexpected features of the LEGO™ brick, such as high-speed ADC sampling. See the RCX program flow-chart.
The NXT is programmed using the LabVIEW® NXT toolkit. See the NXT program, the motor-control sub.vi and the coerc sub.vi.
The sub.vis for the HiTechnic IR-link were kindly provided by the manufacturer. However, the main VI has two important bugs that are fixed in the described programs: the default value for "Connection" should be 1, not 0 (if the default value is used, the IR-link tries to connect to NXT port -255, which obviously does not exist); and the "RCX read" sub.vi is not connected to the port wire (so only NXT port 1 actually works here).
Notes:
    • The current UR-program does not use Blauert's degree of coherence in order to prune bad cues. No error analysis has been applied either.
    • The mechanical errors are about 7° for the x- and z- rotations and 3° for the y-rotations.
    • Accuracy for TDOA estimations: ~30 µs (about one sample period at 36 kHz).
    • The tests were made in an ordinary reverberant room with a volume of 40 m³.
    • The distance to the sound source was 2m.
    • The HiTechnic IR-link sometimes reads bad values!
    • Current NXT program: the NXT switches off after 15 minutes. The power-down function is not disabled yet.
    • We are planning a ROBOLAB 294 version for the NXT.
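The sampling figures quoted on this page (600 values per channel at 36 kHz) fix the time resolution of a single lag step. The short Python calculation below makes this explicit. The speed of sound of 343 m/s is an assumption (air at about 20 °C), and the variable names are chosen for this example only.

```python
FS_HZ = 36_000          # sampling frequency per channel, from this page
N_SAMPLES = 600         # samples per channel, from this page
SPEED_OF_SOUND = 343.0  # m/s, assumption (air at about 20 °C)

sample_period_s = 1.0 / FS_HZ                  # duration of one lag step
window_s = N_SAMPLES / FS_HZ                   # length of one recording
path_step_m = SPEED_OF_SOUND * sample_period_s # path difference per lag step

print(f"sample period: {sample_period_s * 1e6:.1f} µs")    # 27.8 µs
print(f"recording window: {window_s * 1e3:.1f} ms")        # 16.7 ms
print(f"path difference per step: {path_step_m * 1e3:.1f} mm")  # 9.5 mm
```

One lag step therefore corresponds to a difference of roughly 1 cm in the path lengths from the source to the two microphones.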
Literature: