The general consensus is that the lack of a single standard for Quadraphonic sound on the vinyl disc was the major stumbling block that impeded the adoption of Surround Sound as the norm for audio (Quad on tape [more tracks] and FM [more frequency division multiplexing] were easy by comparison).
The "market confusion" of several different and incompatible Quad systems was certainly a huge factor in its commercial failure, but there were at least two other significant negative factors.
1. The original Quadraphonic speaker plan placed the listener in the dead center of what was basically a square array, with a speaker in each corner facing the listener. The "sweet spot" was one seat, and the average living room didn't support the speaker plan well at all, especially with that one seat in the middle of the room. Moving off dead center pretty much confounded any hope of realistic spatial reproduction, especially with the two most marketed Quad systems: QS and SQ. Discrete quad from 8-track or R-R tape worked better, but was far less available. CD-4 came in late and never fully penetrated the market: it added yet another hardware requirement (special cartridge, stylus, and decoder) to the already behemoth Quad system, suffered from rapid degradation with repeated playings, and demanded that record companies produce recordings in "double inventory" (standard stereo and CD-4), as a CD-4 disc would be ruined by a standard stereo stylus.
2. The burden on the listener to buy 4 identical speakers and a huge 4-channel receiver, or an array of separates, for a hard-to-perceive return on investment was the biggest marketing hurdle. Do you get a better result from 4 cheaper speakers or two more expensive ones? Yeah, the two won most of those battles.
Home Theater surround was a different story. First, there was a large library of content already being released even before there were consumer decoders (VHS HiFi, Beta HiFi, and Laserdisc all had stereo tracks), as the home video transfers were derived from the LtRt matrixed theatrical tracks and could be easily decoded. Discrete or matrixed 5.1, or even matrixed 3.1, solved problem #1 with a better speaker plan that includes a hard center for a very wide listening window, a clear "front" location driven by the presence of a video display, and smaller satellite speakers with a subwoofer, which are far easier to place and provide better sound at lower cost. Distances are compensated for during setup (something never provided for in Quad), and all 5.1 coding systems play equally well on the standard ITU array.
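To illustrate why an LtRt track "could be easily decoded", here is a rough sketch of a passive matrix decode. It is a simplification under stated assumptions: the real theatrical encoders also apply 90-degree phase shifts, band-limiting, and noise reduction to the surround channel, and logic decoders add steering, none of which is modeled here. The function names and the -3 dB coefficients are illustrative choices, not any manufacturer's specification.

```python
import math

# Assumed -3 dB mixing coefficient for center and surround (illustrative).
G = 1 / math.sqrt(2)


def encode_ltrt(left, center, right, surround):
    """Fold four channels into two (simplified: no phase shifters).

    Center goes equally and in phase to both totals; surround goes
    equally but in opposite polarity.
    """
    lt = left + G * center + G * surround
    rt = right + G * center - G * surround
    return lt, rt


def decode_passive(lt, rt):
    """Passive decode: center is the sum, surround is the difference."""
    return {
        "L": lt,
        "R": rt,
        "C": G * (lt + rt),
        "S": G * (lt - rt),
    }


if __name__ == "__main__":
    # A center-only signal comes back at full level in C, with nothing
    # in S, but it also leaks into L and R about 3 dB down.
    print(decode_passive(*encode_ltrt(0.0, 1.0, 0.0, 0.0)))
    # A surround-only signal comes back in S, with nothing in C.
    print(decode_passive(*encode_ltrt(0.0, 0.0, 0.0, 1.0)))
```

The leakage into adjacent channels in this sketch is the limited separation of a plain passive matrix; a plain stereo system simply plays Lt and Rt as they are, which is why the existing stereo video releases were already usable content.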
I was involved with some early FM Quad live broadcasts in the 1970s. The encoder was a Sansui QS encoder with in-house modifications for better compatibility with SQ decoders. These were concerts with Lr and Rr being fed from suspended mics over the audience. The presentation was good, and there were a few Quad listeners, but only a few. Stereo compatibility was mostly retained too. Mono compatibility? Meh. In the mid 1980s I was involved with another live FM broadcast, this time mixed with the Shure surround encoder (basically a licensed custom version of the Dolby Stereo 4-2-5.1 matrix process with claimed higher performance) and again with Ls and Rs fed with surround mics. The broadcast could have been decoded on any home theater system equipped with a Dolby Surround or Shure decoder. The broadcast went well, but the extra effort in production was extreme, and thus it was not attempted again.
The argument against Quad of "we have two ears, not four" only works for people with no understanding of spatial hearing, which was the general status of the market at the time.
In the 1930s, Bell Labs conducted some famous experiments in multichannel audio transmission. They favored a massive array of microphones and corresponding transmission channels and speakers, but also concluded that the absolute minimum number of speaker channels required for acceptable stereo was 3, arranged as Left, Center, and Right. Two channels were deemed inadequate because of the fragile phantom center between two speakers, which is highly variable in apparent location with changes in level and head position. Remember, "stereo" does not mean "two", but stems from the Greek word meaning "solid", implying dimensionality and physical position. The reason we got stuck with only two channels has to do with the impracticality of distributing more than two discrete channels on a grooved disc. Spatial hearing processes localization cues from anywhere in a full sphere, which is something even Quad could not replicate, and can only be faked to a minimal extent in a highly restricted listening window with two speakers.