In speech production, no segment in connected speech surfaces in isolation. Rather, the input necessarily undergoes modification through close interaction with neighbouring segments. The discrete, invariant unit (the input) is obscured by overlapping boundaries at both the articulatory and acoustic levels. Within the word domain, vowels universally interact with each other (V-to-V interaction), even across consonants, which can in turn act upon the vowel gestures (C-to-V interaction). While vowels have global gestures, consonants have local ones. Since gestures may be both articulatory (time-based) and acoustic (formant-based), intersegmental interaction manifests as overlap, which is central to coarticulation (CoA). CoA can be defined as an overlap between global (vocalic) and local (consonantal) gestures, or even between the articulatory gestures of vowels. This interaction can also take place at the phonological level, for example when it is revealed through feature spreading. The coarticulation models and theories that have evolved over the last 60 years try to define the nature of this transition from the discrete input to the variable articulation of the output. This paper is a critical review of these models, which deal with variable, non-discrete outputs at the production level. CoA is so complex a phenomenon, involving articulation, acoustics, timing, gesture, and feature, among other aspects, that no single theory or model captures it fully. The paper examines the incompleteness and inadequacies of these models, which point to the need for a composite CoA model.