Speculative parallelization aggressively executes in parallel codes that the compiler cannot fully parallelize. If the hardware detects a cross-thread data dependence violation at run time, it squashes the offending threads and reverts to a safe state.

In the first part of the talk we present our past proposal for a scalable speculative NUMA system that uses speculative CMPs as the building block. In the second part of the talk we present our ongoing work on a new approach to reducing the cost of handling cross-thread data dependence violations in scalable multiprocessors. This approach is based on run-time learning and uses a set of techniques that progressively handle the spectrum of data dependences, from sparse false dependences to dense same-word dependences. In the last part of the talk we discuss our proposal for a new research project to develop new software and hardware infrastructures for speculative parallelization in CMPs, including an automatic speculative parallelization pass.
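The execute-detect-squash-revert cycle described above can be illustrated with a toy sequential simulation. This is only a sketch of the general technique, not the talk's actual hardware or software; every name here (run_speculative, make_iter, the read/write-set convention) is an illustrative assumption:

```python
# Minimal sketch of speculative parallelization, simulated sequentially.
# All names and data structures here are illustrative assumptions, not
# the system presented in the talk.

def run_speculative(iterations, memory, batch=2):
    """Speculatively "launch" `batch` loop iterations at a time.

    Each iteration reads only the pre-batch snapshot (so it may miss a
    logically earlier iteration's writes), buffers its writes, and
    commits in program order. A cross-thread dependence violation --
    reading an address that an earlier iteration in the batch wrote --
    squashes the offending iteration and all logically later ones; they
    restart from the committed (safe) state.
    """
    i = 0
    while i < len(iterations):
        snapshot = dict(memory)            # safe state every thread sees
        results = [fn(snapshot) for fn in iterations[i:i + batch]]
        committed = set()                  # addresses committed this batch
        advanced = 0
        for reads, writes in results:
            if reads & committed:          # read stale data: violation
                break                      # squash this thread onward
            memory.update(writes)          # in-order commit
            committed.update(writes)
            advanced += 1
        i += advanced                      # head thread never squashes,
                                           # so forward progress is assured
    return memory

def make_iter(_):
    """Iteration of `for j: x += 1` -- every pair carries a dependence."""
    def body(view):
        return {"x"}, {"x": view["x"] + 1}   # (read set, write set)
    return body

# Fully dependent loop: each batch squashes its second iteration, yet
# the final state matches sequential execution.
print(run_speculative([make_iter(j) for j in range(4)], {"x": 0}))
# -> {'x': 4}
```

Note the asymmetry the model captures: iterations with no cross-thread dependences commit a whole batch per step, while a dense dependence chain degrades to one commit per step plus squash overhead, which is the cost the run-time learning techniques in the talk aim to reduce.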