The Second Intelligent Species: How Humans Will Become as Irrelevant as Cockroaches
Chapter 5 - How Computer Vision Systems will Destroy Jobs
by Marshall Brain
If you look back at the description of self-driving cars in the previous chapter, you will notice that computer vision does not really play a role. Current self-driving cars do not have two eyes on the roof or the hood looking out at the road and deciding what to do based on visual input. Self-driving cars do have an optical camera, but it plays a small role. For example, it helps the car decide if a traffic light at an intersection is red or green.
This might seem odd to many people. When humans drive a car, visual input through our eyes is essential. Why don't self-driving cars do it the same way? Why doesn't a self-driving car use optical cameras and binocular vision in the same way that human beings use their eyes to sense the world?
Instead of cameras, a self-driving car uses different sensors to detect the world around it. LIDAR and radar are the two most essential sensor packages on a self-driving car.
There is a simple reason for this difference: the computer vision systems that exist in production today (2015) are still fairly primitive. Computer scientists still have a ways to go when it comes to perfecting general vision systems. Yes, there are simple things that computer vision systems can do (for example, this video shows a simple camera system to detect pancakes on a conveyor belt). But at this moment in history, there is not a computer vision system that can look at a common scene of a farm and say, “that is a barn, that is a horse, that is a man, that is the man's hat, that is grass, that is a tree, etc.” A five-year-old human child can do that easily, but computers are not there yet except in special situations.
In the same way, it is not currently possible to put a camera on the front of a car and have a computer use the pictures from the camera to identify other traffic, lane markings, bicyclists, pedestrians, dogs wandering into the street, etc. Computer scientists simply have not yet created the algorithms for computer vision at that level. But research in this area is occurring on many different fronts, both for the general case and for specific situations. In the same way that chess-playing computers eventually beat human players after several decades of research, and a Jeopardy-playing computer beat the best human players, there will eventually be computers running algorithms that are better than human beings at seeing the world. We simply haven't arrived there yet.
The thing to understand is that we will arrive there eventually. As this computer vision research bears fruit, a surprising thing will happen. It turns out that there are many sectors of the economy that will come under new pressure from robots and automation once robots can see the world. Once robots can see things like human beings do, it opens up whole new areas for robots to take over human jobs.
If you think about it, you realize that certain human jobs – jobs that are not particularly difficult to perform otherwise – have been protected from automation because of their dependence on vision. Imagine that you have a time machine and you are able to travel back in time to the year 1950. If you walked into a restaurant, hotel or store in 1950, it would look nearly identical to a restaurant, hotel or store today from an employee perspective. In a store or restaurant in 1950, just as in one today, people do everything: people stock the shelves, prepare the food, serve the food, help customers, man the cash registers, clean the toilets and sweep the floors today in much the same way as they did in 1950. It's the same on any construction site. In 1950, guys with circular saws and hammers built houses. Today it is guys with circular saws and nail guns. No big difference. Similarly, a hotel today has clerks at the front desk and people doing all of the cooking and cleaning just as it did in 1950. An amusement park in 1950 looked much like any amusement park today, with people operating the rides, selling the concessions and keeping the park clean.
Industries like these are, by and large, untouched by automation today. These people-powered industries represent at least half of the jobs in the American job pool.
Think about the people who work in a typical retail store like Wal-Mart, Target, PetSmart, Home Depot, and so on. The jobs these people are doing are not particularly demanding, but they do require vision. Think about tasks like:
straightening the shelves
stocking the shelves
sweeping the floors
moving shopping carts out of the parking lot
scanning for shoplifters
cleaning up a spill
checking inventory levels
A robot with a good vision system and decent dexterity could easily do these jobs. And there are many entrepreneurs and corporations willing to develop these new retail staffing robots because there is a huge marketplace and a big economic incentive. Therefore, the human beings who do these jobs now – Wal-Mart alone has approximately two million people in these kinds of jobs – will all start being displaced once the vision systems are available. Every retailer will start replacing human workers with robots because the robots will be better and cheaper than human labor.
That same trend will be happening at about the same time in the restaurant industry. We can already see the leading edge of the process as restaurants start installing more and more kiosks and tablets to take orders. Millions more workers will be displaced as restaurants fully automate.
The same thing will be happening in the construction industry. Laying bricks, nailing down shingles, painting walls, cutting and setting tile, putting up drywall, nailing together a wall - all of these jobs become easy for robots once a good vision system exists. Millions of construction workers will be replaced.
Factories are already highly automated. For example, in a car factory, robots do all the welding and painting. Many of the factory jobs that remain have not been automated because they require vision. Putting a wiring harness into an automobile on an assembly line is done by humans today because humans can see and easily handle flexible materials. Most other human jobs that remain in an auto assembly factory require vision in the same way. Once robots can see, all of those factory jobs will start going to robots just like all of the welding, painting and machining jobs that are already automated.
Think about all of the custodial jobs in hotels, arenas, college campuses, office parks and homes. With robots that can see, it is possible to clean things, and all of the custodians, janitors and maids start getting replaced.
The point is that there is one key technology – vision – that is holding back the automation of millions of jobs in America. Just the five categories discussed above (retail, restaurants, construction, custodial and manufacturing) represent tens of millions of jobs. Once vision systems become affordable and get deployed in any real sense, then tens of millions of Americans will lose their jobs over the course of a few years.
While good vision systems are still lacking, it is fascinating to see how the automated economy is responding. Human beings are turning into wetware inside of a computerized system. We have all heard of hardware - hardware is the silicon chips that do the computation. And software - software is the computer programs that run on the hardware. Wetware is the human beings who plug into systems to do the things that software and hardware cannot yet do.
We have seen a great example of wetware - and its elimination - in telephone 411 services. A couple of decades ago, human beings handled every part of a 411 call. When a customer dialed 411, a human being would say hello, ask what number the customer needed, listen to the request, perhaps ask a question or two ("Do you want the store on First Street or Elm Street?"), look up the number, and say the number. Then computers took over the process of saying hello and saying the number, with a human being handling the entire middle of the call much as before. Then humans in the system became wetware, handling less and less of the call, to the point where all a human would do is listen to and interpret the spoken request. Today, computers can handle the entire transaction. Systems like Siri can understand speech so well that humans are no longer needed.
And we see more complex systems in much more demanding tasks. If you call an airline or an insurance company, much of the call is handled by computers. Humans only get involved in special cases. Many airlines now charge extra when humans have to get involved. See the free book Manna for other examples.
Meanwhile, computers using Watson-like technologies will be taking jobs from doctors, lawyers, accountants, teachers, etc. in much the same way. Teachers represent a particularly important example, because there is giant economic pressure to eliminate their jobs. In addition, computers are likely to create better, more personalized, more creative forms of education. MOOCs (Massive Open Online Courses) are showing that new methods of teaching are possible on the Internet. MOOCs are one approach out of myriad new teaching approaches that entrepreneurs will try. Teachers in elementary schools, secondary schools and colleges represent something like 4 million jobs that are at risk.
How many jobs will the U.S. economy lose in the near term to various types of robots as they are empowered with vision systems? Let's add it up using the examples described above:
So now, as all of these millions of people start becoming unemployed, think about how American society will respond to the situation. There really is no response possible in the current political climate without a new way of thinking and behaving economically. Think of all of the people who became unemployed in prior U.S. recessions. There is a very meager monthly check available in the form of unemployment benefits if you become unemployed, and this stipend has a time limit on it. Once that runs out, an unemployed person burns through their savings and then what? They become homeless, or move back in with their parents, or depend on a spouse's income, or sleep on a friend's sofa.
Or think about all of the people who lost their factory and textile jobs when those factories moved overseas. They were able to move toward jobs at Wal-Mart and Target. But those jobs paid less and often lacked benefits, and those service-sector jobs will soon be evaporating as well.
This is why I said in Chapter 2 that America is about to enter a painful, embarrassing era where the nation's wealth is concentrating rapidly and large numbers of people are suffering because of unemployment. We can hope that the second intelligent species arrives to end the suffering, but that will likely take time. The process of robotic job replacement is going to happen very quickly once inexpensive computer vision systems become available, and there may be more than a decade of intense suffering in the interim. The rich will get far richer. The middle class will largely disappear and then there will be an ocean of poverty.
The amazing thing is that, as all of these workers are becoming unemployed, there will be vast amounts of wealth being created by the robots and automation. Where will all of that wealth go? It will go to a very small slice of the population in a process called the concentration of wealth...