AccScience Publishing / IJOSI / Volume 7 / Issue 3 / DOI: 10.6977/IJoSI.202209_7(3).0001

Learning and recognizing three-dimensional shapes by a neural network using solid angles

Satoshi Kodama1
1 Tokyo University of Science, JP
Submitted: 5 October 2021 | Revised: 19 July 2022 | Accepted: 5 October 2021 | Published: 19 July 2022
© by the Authors. Licensee AccScience Publishing, USA. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
Abstract

Three-dimensional (3D) shapes differ from two-dimensional (2D) shapes in the amount of data that can be acquired for each shape. In addition, the information that can be obtained from a 3D shape varies greatly with the viewing angle and the posture of the object, and there is currently no universal countermeasure for this problem. It is therefore difficult to acquire the level of features necessary for machine learning. To learn and recognize 3D shapes, approaches that learn from images taken at various angles, techniques using normal vectors, and approaches that capture the overall structure via voxelization have been studied thus far. However, these methods are not always effective because they complicate the preprocessing of the data required for learning. In this paper, we propose a method that uses solid angles as a new quantitative feature for learning and recognition. The solid angle is the 3D analogue of the plane angle of a 2D shape; when a viewpoint is fixed, a constant value is obtained regardless of the posture of the object. Moreover, although the calculations required to obtain this value are intensive and time-consuming, they can be performed in a relatively simple manner. In this study, primitive shapes are learned and recognized using solid angles as a quantitative feature. We demonstrate that, after training a neural network, this method can appropriately recognize a given shape.
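The abstract's key claim is that the solid angle subtended by a shape at a fixed viewpoint is invariant under rotation of the object about that viewpoint. The paper's own computation is not shown here, but a minimal sketch of this property can be given with the standard Van Oosterom–Strackee formula for the solid angle of a triangle; the cube example and the function name `solid_angle` below are illustrative assumptions, not taken from the paper.

```python
import math


def solid_angle(o, a, b, c):
    """Solid angle (steradians) subtended at viewpoint o by triangle (a, b, c),
    computed with the Van Oosterom-Strackee formula."""
    r1 = [a[i] - o[i] for i in range(3)]
    r2 = [b[i] - o[i] for i in range(3)]
    r3 = [c[i] - o[i] for i in range(3)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def cross(u, v):
        return [u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0]]

    n1, n2, n3 = (math.sqrt(dot(r, r)) for r in (r1, r2, r3))
    numerator = abs(dot(r1, cross(r2, r3)))          # |r1 . (r2 x r3)|
    denominator = (n1 * n2 * n3 + dot(r1, r2) * n3
                   + dot(r1, r3) * n2 + dot(r2, r3) * n1)
    return 2.0 * math.atan2(numerator, denominator)


# One face of a cube, seen from the cube's centre, subtends 4*pi/6 sr.
# Split the square face into two triangles and sum their solid angles.
centre = (0.0, 0.0, 0.0)
face = [((1, 1, 1), (-1, 1, 1), (-1, -1, 1)),
        ((1, 1, 1), (-1, -1, 1), (1, -1, 1))]
omega = sum(solid_angle(centre, *tri) for tri in face)

# Rotating the shape about the viewpoint leaves the solid angle unchanged,
# which is the invariance the proposed feature relies on.
rotated = [tuple((-y, x, z) for (x, y, z) in tri) for tri in face]
omega_rot = sum(solid_angle(centre, *tri) for tri in rotated)

print(omega, omega_rot)   # both ~ 2.0944 (= 2*pi/3)
```

Sampling such values at fixed viewpoints around an object yields posture-independent scalars, which is what makes them usable as inputs to a neural network without the view-dependent preprocessing the abstract criticizes.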

Keywords
Neural networks
Shape recognition
Shape registration
Solid angle
International Journal of Systematic Innovation, Electronic ISSN: 2077-8767 Print ISSN: 2077-7973, Published by AccScience Publishing