Researchers have found that having a NN with too many
nodes allows the network to memorize the patterns being
presented, thus eliminating the need to extract dominant
patterns and features from the training samples.
Although this causes very good outputs when the training
samples are presented during the testing phase,
performance decreases (usually below acceptable levels)
when novel patterns are shown to the network ( see,
for example, Caudill (1990) ). At the same time, since
it is the hidden nodes that act as memory, not having
enough nodes can deprive the network of the ability to
remember factors that might be important for the task
at hand. Deciding how many hidden nodes for a problem
like sentence parsing is still an open question.
|