Humans obtain an estimated 70% to 80% of their information through vision. Similarly, in artificial intelligence, visual AI is regarded as one of the most promising technologies for practical application today. It gives machines the ability to recognize people and objects, enabling them to perceive and understand the world and greatly improving information-processing efficiency in production and work.
According to reports from multiple market research institutions, sales in the global visual AI market have continued to grow in recent years. Sales revenue reached $11.351 billion in 2022 and is expected to grow to $21.81 billion by 2029, with a compound annual growth rate (CAGR) of 10.51%.
In China, the visual AI market has also shown strong growth momentum. In 2023, China's computer vision market (a key component of visual AI) reached 57.19 billion yuan, a year-on-year increase of 20.2%. This figure highlights the scale of China's visual AI market and indicates that it is growing far faster than many other industries.
In this issue of "Ningdian Interview", we are honored to invite Mr. Zhu Caizhi, Chairman of InterLingda Information Technology (Shenzhen) Co., Ltd., to share his insights on the current challenges and future development trends of AI vision technology.
Zhu Caizhi has worked on research in image processing and computer vision since 2000. He first gained industry experience at Microsoft Research Asia and Ricoh Corporation of Japan, and then pursued an academic career at a Japanese national research institute and Nagoya University, progressing from postdoctoral researcher to assistant professor and then associate professor. During this period, he won the NIST vision algorithm annual world championship three times, published more than 30 international academic papers, and obtained more than 30 patents.
After returning to China, Zhu Caizhi first joined the vision team led by Professor Tang Xiaoou (founder of SenseTime) at the Shenzhen Advanced Technology Research Institute of the Chinese Academy of Sciences. One year later, he turned to full-time entrepreneurship, serving as Chairman and General Manager of InterLingda. He was named one of the GAISC Award 2018 AI TOP10 Pioneer Figures.
As one of China's first visual AI entrepreneurs, how do you define the concept of visual AI?
Zhu Caizhi: Nowadays, when people mention AI, they often narrowly associate it with large models or deep learning. In reality, the scope of AI goes far beyond that. For example, chatbots draw on AI's text-processing ability, but visual information matters more in modern society, with up to 80% of information coming from vision. Before deep learning and large models became popular, AI already encompassed a range of technologies and theories, such as statistical machine learning and symbolism, which also formed an important part of vision research.
Visual AI is the science of making machines able to "see" and "understand". It uses cameras and computers in place of human eyes to recognize, track, measure, and analyze images and videos, thereby recognizing, detecting, and tracking objects in the real world. It trains computers to replicate the human visual system so that digital devices can recognize and process objects in images and videos as humans do.
The applications of visual AI are extremely broad. They are not limited to vision itself but also involve areas such as speech and text understanding. The essence of visual AI is to replace some functions of the human eye and brain. For example, in facial recognition access control systems, AI can already identify and verify identities accurately, replacing the tedious identity checks performed by security personnel.
On industrial production lines, AI improves production efficiency and product quality by detecting defects in products or parts, while reducing workers' labor intensity. In addition, in sectors such as banking and insurance, AI can automatically recognize and enter bill information, greatly improving work efficiency.
InterLingda has proposed the ideas of "being able to see clearly" and "being able to understand" in visual AI. How should we understand the relationship between the two?
Zhu Caizhi: Being able to see clearly and being able to understand are the two core challenges in visual AI. In today's rapidly developing field of vision technology, many companies focus on the problem of "understanding", which covers facial recognition, precision inspection of industrial components, efficient object recognition by logistics robots, and intelligent monitoring of violations in transportation systems. These applications all rely on advanced image and video capture technology as well as highly intelligent data analysis. Together, these form the core of machine vision systems, which aim to use the intelligent eyes of machines to make more accurate and efficient judgments and decisions in place of humans.
However, while pursuing 'understanding', we must not overlook the fundamental and crucial issue of 'seeing clearly'. If 'understanding' is the intelligent brain of machine vision, then 'seeing clearly' is its keen eyes for perceiving the world. Once the data is flawed at the source, that is, at the image acquisition stage (for example, image blur caused by insufficient lighting), no subsequent analysis algorithm, however advanced, can draw accurate and reliable conclusions.
In practical applications, the challenge of 'seeing clearly' is everywhere. The darkness of night, the need for concealment in military areas such as border defense, the complexity of all-weather monitoring at sites such as ports, and the monitoring needs of vast areas such as forest fire prevention zones all make supplementary lighting impossible or undesirable. In these scenarios, machine vision systems must rely on their own perceptual abilities to maintain clear imaging even under extreme conditions.
'Seeing clearly' is a hard problem because it is strictly constrained by physical limits. The performance of core components such as lenses and sensors directly determines the clarity and detail of an image. In low light, the number of photons drops sharply, and sensor losses during photoelectric conversion also increase. In addition, interference such as thermal noise from electronic components makes the final output signal extremely weak and sometimes hard to distinguish at all.
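To make the photon argument concrete, here is a minimal Python sketch that is not taken from the interview. It assumes a textbook single-pixel noise model in which photon shot noise follows Poisson statistics and read noise and dark signal add in quadrature. The function name `pixel_snr` and the default values for quantum efficiency and read noise are illustrative assumptions, not InterLingda's figures.

```python
import math


def pixel_snr(photons, quantum_efficiency=0.6, read_noise_e=3.0, dark_e=0.0):
    """Approximate signal-to-noise ratio of one pixel for one exposure.

    photons            -- mean number of photons reaching the pixel
    quantum_efficiency -- fraction of photons converted to electrons (assumed)
    read_noise_e       -- sensor read noise in electrons RMS (assumed)
    dark_e             -- mean dark-signal electrons during the exposure (assumed)
    """
    signal_e = photons * quantum_efficiency
    # Shot noise variance equals the mean count (Poisson); read noise adds in quadrature.
    noise_e = math.sqrt(signal_e + dark_e + read_noise_e ** 2)
    return signal_e / noise_e


if __name__ == "__main__":
    # Sweep from a well-lit pixel down to a very dark one.
    for photons in (10000, 1000, 100, 10):
        print(f"{photons:>6} photons -> SNR {pixel_snr(photons):6.2f}")
```

Under these assumptions, SNR scales roughly with the square root of the collected signal, so a hundredfold drop in light costs about a tenfold drop in SNR in the shot-noise-limited case, and more once read noise dominates.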
To break through this physical limit, the InterLingda team spent six years on in-depth research, analyzing every stage of camera imaging. Each step, from light incidence, lens focusing, and sensor photoelectric conversion to signal amplification, quantization, and transmission, was carefully examined and optimized. At the same time, the team made full use of artificial intelligence, applying algorithms such as deep learning to enhance and restore images intelligently and thereby improve image clarity at the source.
Experts from multiple fields, including optics, computer vision, hardware design, and chip manufacturing, took part in this project, each drawing on their strengths to overcome one technical challenge after another. Along the way, we not only published a series of world-class academic papers and obtained patents, but also successfully designed and manufactured chips and products with independent intellectual property rights.
Beyond solving technical pain points, in which industries will visual AI bring about change?
Zhu Caizhi: In low-light environments, ordinary cameras often struggle to reproduce true colors. AI-based intelligent image recognition, by contrast, can analyze and process the color information in an image to restore more realistic and natural colors. AI algorithms can also recognize target objects and people in images in real time, and track and record them.
In addition to target recognition, AI algorithms can intelligently analyze human behavior in images. For example, they can recognize abnormal behaviors such as running and falling, and automatically trigger alarms. This not only improves the accuracy and response speed of monitoring, but also helps prevent safety accidents.
In commercial applications, China is leading the world in this wave of AI. Computer vision (CV) is especially sought after by domestic capital and entrepreneurs. Security was the early application scenario that almost everyone chose, but traditional security manufacturers such as Hikvision and Dahua hold more advantages there. Later, industries such as industrial defect detection, medical assisted diagnosis, robotics, and assisted driving also attracted visual AI companies. At the same time, traditional security is becoming more connected to the Internet of Things, and there are many sub-scenarios in the broader security field.
So we can point to several practical scenarios. First, mining is a typical example: because of safety and lighting limitations, mines need a solution that ensures sufficient brightness without creating safety hazards. Equipment wear and tear in mines also cannot be ignored, which makes low-light imaging technology particularly important.
Flash lights on highways are another application scenario. These lights provide additional illumination to capture license plates as vehicles pass at speed, but the strong light inconveniences drivers and causes significant equipment damage. Low-light imaging technology can improve capture clarity without producing strong light, which solves this problem. Open sites such as ports and docks also require 24-hour monitoring and identification. At night, poor imaging lowers the recognition rate, whereas low-light imaging technology can keep the recognition rate high.
Beyond the scenarios above, in criminal investigation, low-light imaging technology can help the police capture clear images at night and identify suspects. For preventing illegal activities such as poaching and fish theft, low-light imaging technology can monitor areas such as lakes and reservoirs to deter dangerous behavior at night. In industrial defect detection, low-light imaging technology can greatly improve detection accuracy and reduce labor costs.
In consumer products, the combination of visual AI with smart hardware and smart home products, including smart doorbells, smart door locks, and baby care devices, has a large market overseas. Domestic operators are also vigorously promoting visual AI in application scenarios such as kitchen lighting, drowning prevention, detection of objects thrown from heights, and smoke and fire recognition.
Visual AI has developed rapidly and become widespread, but are there factors currently restricting its development?
Zhu Caizhi: AI vision technology is developing rapidly, and existing hardware such as computing and imaging devices still has untapped potential, but the field still faces many challenges.
First, improving the hardware infrastructure is an important aspect. As technology advances, computing power will keep improving, allowing algorithms to run with fewer physical constraints, support more complex processing, and deliver better performance. Computing power is the foundation on which AI vision technology runs. As algorithms such as deep learning continue to develop, the demand for computing power also keeps increasing. Although computing hardware is constantly improving, challenges remain in cost, heat generation, and energy efficiency. For example, high-specification computing devices offer powerful performance, but they are expensive and consume a lot of energy, which hinders large-scale adoption. How to reduce cost and energy consumption while maintaining performance is therefore the key to the development of computing devices.
In addition, the development of electronic components is crucial, including improvements to hardware such as lenses, which will provide strong support for the development of AI vision technology. Imaging devices are the windows through which AI vision technology obtains information. As sensor technology advances, the resolution, frame rate, color reproduction, and other metrics of imaging devices keep improving. However, imaging equipment is also limited by physical conditions such as lens distortion and lighting. How to improve imaging quality through algorithm optimization and other means within these physical limits is therefore the key to the development of imaging equipment.
As algorithms continue to be optimized, the performance and accuracy of AI vision technology keep improving. However, algorithm optimization also faces challenges in computational complexity and data dependency. How to reduce computational complexity and dependence on data while maintaining performance is the key to algorithm optimization.
The development of visual AI technology requires interdisciplinary support. For example, combining knowledge from disciplines such as physics and biology can further improve imaging quality and algorithm performance. However, interdisciplinary research requires close collaboration between experts from different fields, which places higher demands on resource integration and teamwork.
How do you plan the future development direction of InterLingda, and what innovations do you hope visual AI will bring in the future?
Zhu Caizhi: As an innovative team with a degree of recognition and strength, InterLingda is well aware of its own advantages and potential. We believe the opportunity lies in becoming a platform-based company for visual innovation, filling the gap in China's forward-looking capabilities in this field.
Currently, many world-class AI companies such as SenseTime tend to focus on businesses such as large models and autonomous driving, while neglecting the needs of many industry segments. However, we firmly believe that in the future large models will act as the 'brain', while small models at the edge will serve as the collection terminals for the physical world.
These collection terminals, such as the smoke alarm we developed, will use vision as the entry point and become an important part of all kinds of sensors. These sensors will be distributed in every corner, using vision to define and perceive the physical world.
As a platform-based company, we have mastered all the elements of end-to-end innovation, including hardware design and core algorithm design capabilities. We are committed to supporting experts in each industry, meeting their product needs, and promoting the development of personalized products for specific segments. This is a market opportunity worth billions, and we are confident in it.
We focus on improving the imaging quality and intelligent recognition capability of cameras, so that they can not only see clearly but also understand. We pass this information to the backend big model for overall planning and decision-making.
'Seeing clearly' and 'understanding' are not simply equivalent. Therefore, in research and development, InterLingda treats the two problems together and adopts an end-to-end solution, which is our strength.
In the future, InterLingda will provide core technologies, including algorithms, chips, and hardware design solutions, while relying on partners for hardware products and sales channels. Our goal is to jointly create intelligent products that differentiate business travel and target the mid-to-high-end incremental market, including both the B-end and the C-end.
In the B-end market, we mainly cooperate with industry giants such as Huawei, leveraging their sales channels and brand influence to bring our high-quality algorithms and adapted hardware to a wider user base. We are also optimistic about the potential of the C-end market, especially overseas. We will work closely with solution companies in the Pearl River Delta to leverage their manufacturing advantages, increase gross profit margins and market share for ecosystem partners, and jointly eliminate disorderly and internal competition.