Meet the Team
From left to right:
- Yuanxin Zhong: Mechanical Engineering, Automotive
- Martin Deegan: Computer Science and Mathematics
- Kaustav Chakraborty: Robotics
- Purva Kulkarni: Electrical and Computer Engineering, Embedded Systems
- Christine Searle: Robotics
ORB-SLAM
ORB-SLAM is an open-source implementation of pose-landmark graph SLAM. The original release handles monocular input; its second generation, ORB-SLAM2, adds stereo and RGB-D camera support. Both read images through the OpenCV library.
Our multi-agent system is an enhancement of ORB-SLAM2.
Diagram of the ORB-SLAM2 implementation, from Mur-Artal and Tardós's 2017 paper, "ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras".
Multi-Agent ORB-SLAM
Two are better than one, because they have a good return for their labor: if either of them falls down, one can help the other up.
There are two major benefits to a multi-agent SLAM system:
- Two robots exploring in parallel can cover the same space in roughly half the time, provided their paths overlap little.
- If the two robots can identify each other, each sighting of the other robot is an additional opportunity for loop closure. We use Ed Olson's popular AprilTag system for robot identification.
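As a rough illustration of how a sighting of the other robot can tie two maps together, the sketch below composes planar poses. It assumes, purely for illustration, that every pose is an (x, y, heading) tuple and that a tag detection yields robot B's pose relative to robot A. The function names and example values are hypothetical and are not taken from ORB-SLAM2 or the AprilTag library.

```python
import math

def compose(a, b):
    """Return pose a followed by pose b, where b is expressed in a's frame."""
    ax, ay, at = a
    bx, by, bt = b
    x = ax + math.cos(at) * bx - math.sin(at) * by
    y = ay + math.sin(at) * bx + math.cos(at) * by
    theta = math.atan2(math.sin(at + bt), math.cos(at + bt))  # wrap to [-pi, pi]
    return (x, y, theta)

def inverse(p):
    """Return the pose that undoes p."""
    x, y, t = p
    c, s = math.cos(t), math.sin(t)
    return (-(c * x + s * y), s * x - c * y, -t)

def map_b_in_map_a(a_in_map_a, b_seen_from_a, b_in_map_b):
    """Estimate where map B's origin sits in map A from one tag sighting."""
    b_in_map_a = compose(a_in_map_a, b_seen_from_a)
    return compose(b_in_map_a, inverse(b_in_map_b))

# Made-up example values (x, y, heading in radians).
a_in_map_a = (2.0, 1.0, 0.5)     # robot A's pose in its own map
b_seen_from_a = (1.0, 0.2, 0.3)  # robot B relative to A, from the tag detection
b_in_map_b = (4.0, -1.0, 1.0)    # robot B's pose in its own map

alignment = map_b_in_map_a(a_in_map_a, b_seen_from_a, b_in_map_b)
# Any pose from map B can now be expressed in map A's frame.
b_expressed_in_a = compose(alignment, b_in_map_b)
```

A real system works with full 3D poses and treats the sighting as one noisy constraint in the pose graph rather than as an exact alignment.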
However, a multi-agent system needs a mechanism for combining the map data produced by the ORB-SLAM instance running on each robot. Future implementations may use a client-server architecture for this fusion; here we simply create a separate server thread with direct access to the client threads.
Datasets
In phase one, initial testing used the stereo KITTI dataset, recorded in Karlsruhe, Germany. To simulate two clients running simultaneously, we split the grayscale stereo sequence 00 of the KITTI dataset in half and shifted the timestamps of the second half to align with the first half.
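The split can be pictured with a short sketch. It assumes each frame is a (timestamp, frame id) pair and that "aligning" means shifting the second half so it starts at the same time as the first. The exact procedure and file layout are not described here, so the function name and data layout are illustrative only.

```python
def split_sequence(frames):
    """Split time-ordered (timestamp, frame_id) pairs into two client tracks.

    The second track's timestamps are shifted so both tracks start together.
    """
    mid = len(frames) // 2
    first, second = frames[:mid], frames[mid:]
    if not first or not second:
        return first, second
    offset = second[0][0] - first[0][0]
    shifted = [(stamp - offset, frame_id) for stamp, frame_id in second]
    return first, shifted

# Six made-up frames, 0.5 s apart.
frames = [(i * 0.5, i) for i in range(6)]
client_a, client_b = split_sequence(frames)
print(client_a)  # [(0.0, 0), (0.5, 1), (1.0, 2)]
print(client_b)  # [(0.0, 3), (0.5, 4), (1.0, 5)]
```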
In phase two, we created a custom dataset with AprilTag-marked "robots" on the second floor of the EECS building at the University of Michigan.
Modifications to ORB-SLAM
To simulate two clients, we ran two simultaneous instances of ORB-SLAM, each on an adjustable-size portion of KITTI stereo sequence 00. Each client instance of ORB-SLAM spawns three threads: tracking, mapping, and loop closing. We added a fourth thread to simulate a server merging data from the two client instances. This server thread detected and performed loop closures on the combined data of the two clients, creating a larger combined map of the environment.
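The thread layout can be sketched in a few lines. This is a toy model, not the project's code: each client is collapsed into a single thread, keyframes are (client, timestamp, place) tuples, and the hypothetical `place` label stands in for real appearance-based place recognition.

```python
import threading

def run_client(client_id, frames, client_maps):
    """Stand-in for one ORB-SLAM instance: keep one keyframe per input frame."""
    client_maps[client_id] = [(client_id, stamp, place) for stamp, place in frames]

def find_between_client_closures(map_a, map_b):
    """Pair keyframes from different clients that carry the same place label."""
    seen = {}
    for cid, stamp, place in map_a:
        seen.setdefault(place, []).append((cid, stamp))
    closures = []
    for cid, stamp, place in map_b:
        for other in seen.get(place, []):
            closures.append((other, (cid, stamp)))
    return closures

def run_server(client_threads, client_maps, result):
    """Wait for both clients, then merge their maps and look for closures."""
    for thread in client_threads:
        thread.join()
    result["map"] = client_maps[0] + client_maps[1]
    result["closures"] = find_between_client_closures(client_maps[0], client_maps[1])

# Made-up tracks: both clients pass through place "b" at timestamp 1.
tracks = {0: [(0, "a"), (1, "b"), (2, "c")], 1: [(0, "d"), (1, "b"), (2, "e")]}
client_maps, result = {}, {}
clients = [threading.Thread(target=run_client, args=(cid, frames, client_maps))
           for cid, frames in tracks.items()]
server = threading.Thread(target=run_server, args=(clients, client_maps, result))
for thread in clients:
    thread.start()
server.start()  # started after the clients so that joining them is valid
server.join()
print(result["closures"])  # [((0, 1), (1, 1))]
```

Because the server has direct access to `client_maps`, no serialization or messaging is needed, which is the simplification the thread-based design buys.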
The complete code for our implementation of multi-agent ORB-SLAM is available on GitHub.
Results
Phase One: KITTI Dataset
This image shows a successful loop closure point, where the two views forming the loop closure come from different data tracks ("clients").
This is a larger portion of the map generated by the server thread.
Phase Two: Custom Dataset
The image on the left shows an intersection from the KITTI dataset which, because the agents approach from opposite directions, cannot be used as a loop closure point. In contrast, our AprilTag-enhanced system can use this type of intersection as a loop closure point, thanks to agent recognition. The image on the right shows both within-client and between-client loop closures.
Blue lines indicate keyframes; green lines indicate loop closures within a client; red lines indicate loop closures from one client to the other.
See it in Action
Future Work
Real-time SLAM
This implementation gathers all data from both clients before merging the two maps into one. While this is acceptable for many applications, it is typically more useful to build the server map incrementally at the same time as the client maps.
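One way to picture the incremental alternative is a server map that checks for between-client closures each time a keyframe arrives. The sketch below is a toy model under stated assumptions: the class name is hypothetical, place recognition is reduced to matching a `place` label, and arrival order is simulated by merging the two tracks by timestamp.

```python
import heapq

class IncrementalServerMap:
    """Toy server map that looks for closures as each keyframe arrives."""

    def __init__(self):
        self.keyframes = []
        self.by_place = {}
        self.closures = []

    def add_keyframe(self, client_id, stamp, place):
        """Insert one keyframe and return any new between-client closures."""
        new = (client_id, stamp)
        found = [(old, new) for old in self.by_place.get(place, [])
                 if old[0] != client_id]
        self.keyframes.append(new)
        self.by_place.setdefault(place, []).append(new)
        self.closures.extend(found)
        return found

# Made-up tracks of (timestamp, client id, place label), each sorted by time.
track_a = [(0, 0, "a"), (1, 0, "b"), (2, 0, "c")]
track_b = [(0, 1, "d"), (1, 1, "e"), (2, 1, "b")]

server = IncrementalServerMap()
for stamp, client_id, place in heapq.merge(track_a, track_b):
    new_closures = server.add_keyframe(client_id, stamp, place)
    if new_closures:
        print(stamp, new_closures)  # 2 [((0, 1), (1, 2))]
```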
True Client-Server Separation
Running client threads and a server thread to mimic physically separate agents and a centralized server glosses over several difficulties. Physical separation limits data sharing between the clients and the server, chiefly in how much data can be sent within a given timestep. Introducing communication protocols also raises the potential for communication failures, so a robust multi-agent system needs a way to compensate for an unexpected loss of communication.
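Both constraints, a per-timestep data budget and link outages, can be illustrated with a small client-side buffer. This is only a sketch under assumed names: `ClientLink`, `max_per_step`, and the simulated outage pattern are hypothetical, and a real system would also need acknowledgements and retransmission.

```python
from collections import deque

class ClientLink:
    """Hypothetical client-side sender with a send budget and an outage buffer."""

    def __init__(self, max_per_step):
        self.max_per_step = max_per_step  # keyframes allowed per timestep
        self.pending = deque()            # keyframes not yet delivered

    def queue(self, keyframe):
        self.pending.append(keyframe)

    def step(self, link_up, server_inbox):
        """Send up to the budget if the link is up; otherwise keep buffering."""
        if not link_up:
            return 0
        sent = 0
        while self.pending and sent < self.max_per_step:
            server_inbox.append(self.pending.popleft())
            sent += 1
        return sent

# Simulated link: up, down for two timesteps, then up again.
link_status = [True, False, False, True, True]
link = ClientLink(max_per_step=2)
server_inbox = []
sent_per_step = []
for keyframe_id, link_up in enumerate(link_status):
    link.queue(keyframe_id)  # one new keyframe per timestep
    sent_per_step.append(link.step(link_up, server_inbox))

print(sent_per_step)  # [1, 0, 0, 2, 2]
print(server_inbox)   # [0, 1, 2, 3, 4]
```

Buffering preserves keyframe order across the outage, at the cost of the server map lagging behind until the backlog drains.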
Leveraging the Enhanced Server Map
The current implementation only builds the server map after all client data has been collected. An incremental multi-agent system, in addition to building the server map in real time, should send data back to the clients to improve their maps. In exploration situations, client knowledge of the larger picture is critical for deciding what action to take next.
Moving Past April Tags
While extremely useful, as shown by their widespread adoption, AprilTags are an artificial aid to robot recognition. A more elegant solution, one better able to blend into the natural world, could instead recognize the agents from learned features of the robots themselves.
References
- Gálvez-López, Dorian, and Juan D. Tardós. "Bags of binary words for fast place recognition in image sequences." IEEE Transactions on Robotics 28.5 (2012): 1188-1197.
- Mur-Artal, Raul, Jose Maria Martinez Montiel, and Juan D. Tardós. "ORB-SLAM: A versatile and accurate monocular SLAM system." IEEE Transactions on Robotics 31.5 (2015): 1147-1163.
- Mur-Artal, Raul, and Juan D. Tardós. "ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras." IEEE Transactions on Robotics 33.5 (2017): 1255-1262.
- Rublee, Ethan, et al. "ORB: An efficient alternative to SIFT or SURF." 2011 International Conference on Computer Vision. IEEE, 2011. 2564-2571.
- Olson, Edwin. "AprilTag: A robust and flexible visual fiducial system." 2011 IEEE International Conference on Robotics and Automation. IEEE, 2011.