A method of parallelizing a pipeline includes stages operable on a sequence of work items. The method includes allocating an amount of work for each work item, assigning at least one stage to each work item, partitioning the at least one stage into at least one team, partitioning the at least one team into at least one gang, and assigning the at least one team and the at least one gang to at least one processor. Processors, gangs, and teams are juxtaposed near one another to minimize communication losses.