Hi everybody, dear riddle-solvers,
this is my first post, so please bear with me if there is some inconsistencies or ambiguities in this.
I work for a large organisation that, in course of it's business intelligence activities, performs large-scale surveys.
We use Excel to handle this data - not ideal, I know, but not the reason for this post. Unfortunately, I cannot share the actual data with you, for privacy reasons.
Every once in a while, we add new questions to the questionnaire, which will appear as new columns in our database (columns = questions; rows=respondents).
In order to implement the newly added questions into the database, I have to compare the raw data-"database" with the existing one and, where there is a new question in the questionnaire add a new column in, that will later include all the data.
Example:
Raw_quest_1: Q1, Q2, Q3,Q4,Q5,Q6,Q7
Database_1: Q1, Q2, Q3,Q4,Q5,Q6,Q7
After one year new questions were introduced:
Raw_quest_2: Q1, Q1.1, Q1.2, Q1.3, Q2, Q3,Q4, Q4.1, Q4.2 ,Q5, Q5.1, Q5.2, Q5.3, Q5.4 ,Q6, Q7 Here you see, questions were added.
Database_2: Q1, Q1.1, Q1.2, Q1.3, Q2, Q3,Q4, Q4.1, Q4.2 ,Q5, Q5.1, Q5.2, Q5.3, Q5.4 ,Q6, Q7 Here you see, the database was adjusted.
After another year, more new questions were added, other ones abandoned:
Raw_quest_2: Q1, Q1.3, Q2, Q2.1, Q2.2, Q2.3, Q3, Q4, Q4.1, Q4.2 ,Q5, Q5.1, Q5.2, Q5.3, Q5.4 ,Q6, Q6.1, Q6.2, Q6.3, Q7 Here you see, questions were added.
Database_2: Q1, Q1.1, Q1.2, Q1.3, Q2, Q2.1, Q2.2, Q2.3, Q3, Q4, Q4.1, Q4.2 ,Q5, Q5.1, Q5.2, Q5.3, Q5.4 ,Q6, Q6.1, Q6.2, Q6.3, Q7 Here you see, the database was adjusted.
After another year, more new questions were added, other ones abandoned:
Raw_quest_2: Q1, Q1.3, Q2, Q2.1, Q2.2, Q2.3, Q3, Q4, Q4.1, Q4.2, Q5, Q5.1, Q5.2, Q5.2.1, Q5.2.2, Q5.2.3, Q5.3, Q5.4 ,Q6, Q6.1, Q6.2, Q6.3, Q7 Here you see, questions were added.
Database_2: Q1, Q1.1, Q1.2, Q1.3, Q2, Q2.1, Q2.2, Q2.3, Q3, Q4, Q4.1, Q4.2, Q5, Q5.1, Q5.2, Q5.2.1, Q5.2.2, Q5.2.3 Q5.3, Q5.4 ,Q6, Q6.1, Q6.2, Q6.3, Q7 Here you see, the database was adjusted.
So, here comes what I need to do, and I have no idea, where to start even:
- I need a Macro that compares the new data (which is in another sheet than the database, however is in the same workbook) with the database left to right, taking into account abandoned questions, newly added questions and empty columns (which just serve the purpose of separating stuff thematically in the database, but do not exist in the raw data).
- I always appreciate a macro that is "adjustable", so let's me select the area I want it to operate in, as the database has some extra information around that doesnt make it start at A1, but at a different point, whereas the raw data starts at A:1.
- Whilst it is relatively unimportant what line the data is in, the columns have to always be identical between database and questionnaire (so if a question is in column AAC in the database, it should be or become column AAC in the questionnaire; If a question was in Column ZZ before in the database, however moved due to insertion of columns, the same number of columns in the same places have to be added into the questionnaire)
Example:
Database (called 'Saisie'):
Q1, Q1.1, Q1.2, Q1.3, Q2, Q2.1, Q2.2, Q2.3, Q3, Q4, Q4.1, Q4.2, Q5, Q5.1, Q5.2, Q5.2.1, Q5.2.2, Q5.2.3, Q5.3, Q5.4 ,Q6, Q6.1, Q6.2, Q6.3, Q7
Questionnaire (let's say called Questionnaire_new):
Q1, Q1.1, Q1.2, Q1.3, Q2, Q2.2, Q2.3, Q3, Q3.1, Q3.2, Q3.3, Q3.4, Q4, Q4.1, Q4.2, Q5, Q5.1, Q5.2.1, Q5.2.2, Q5.2.3, Q6, Q6.1, Q6.2, Q6.3, Q7
What I would need this macro to do is:
Q1=Q1 --> no action
Q1.1 = Q1.1 --> no action
.
.
.
Q2 = Q2 --> no action
Q2.1 is missing in questionnaire --> insert column in raw data after Q2.1, name the topmost cell in this selection (aka the header line) "Q2.1" as specified in the database
Q2.2 = Q2.2 --> no action
.
.
.
Q3 = Q3 --> no action
Q3.1 is missing in database --> insert column after Q3, name the topmost cell in this selection (aka the header line) "Q3.1" as specified in the questionnaire
Q3.2 is missing in database --> insert column after Q3.1, name the topmost cell in this selection (aka the header line) "Q3.2" as specified in the questionnaire
Q3.3 is missing in database --> insert column after Q3.2, name the topmost cell in this selection (aka the header line) "Q3.3" as specified in the questionnaire
Q3.4 is missing in database --> insert column after Q3.3, name the topmost cell in this selection (aka the header line) "Q3.4" as specified in the questionnaire
.
.
.
Q4 = Q4 --> no action
Spacer_column in database, however not in raw_data ---> insert column after Q4
.
.
.
Q5.2.3 = Q5.2.3 --> no action
.
.
.
Q5.3 is missing in questionnaire --> insert column in raw data after Q5.2, name the topmost cell in this selection (aka the header line) "Q2.3" as specified in the database
.
.
.
until the end of the selection
I hope someone can help me with that, last time I spent two weeks 8 hours a day inserting and renaming columns...
Thank you so much for ANY help, that can help me make my life with this a bit easier...
Cheers,
Bookmarks