+ Reply to Thread
Results 1 to 15 of 15

String parsing with variable lenght strings

  1. #1
    frosterrj
    Guest

    String parsing with variable lenght strings

    I am trying to split up a cell into numbers and charachters and place them in
    separate columns, but the lenght of the number part varies as does the
    content of the character part.

    For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    own column.

    I have found this formula:
    =LEFT(A1,FIND("-",A1,1)-1)

    but it assumes some level of consistency, the "-" in the cell.

    Any ideas appreciated.

    Robert

  2. #2
    Ron Coderre
    Guest

    RE: String parsing with variable lenght strings

    Perhaps something like this:

    For a value in A1

    The numeric (left) part:
    B1:
    =--LEFT(A1,MATCH(FALSE,ISNUMBER(--MID(A1,ROW($A$1:INDEX(A:A,LEN(A1))),1)),0)-1)

    The unit of measure (right) part:
    C1:
    =TRIM(MID(A1,MATCH(FALSE,ISNUMBER(--MID(A1,ROW($A$1:INDEX(A:A,LEN(A1))),1)),0),255))

    Note: Commit those array formulas by holding down the [Ctrl][Shift] keys
    when you press [Enter].

    Copy those formulas down as far as you need.

    Does that help?

    ***********
    Regards,
    Ron

    XL2002, WinXP-Pro


    "frosterrj" wrote:

    > I am trying to split up a cell into numbers and charachters and place them in
    > separate columns, but the lenght of the number part varies as does the
    > content of the character part.
    >
    > For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    > I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    > own column.
    >
    > I have found this formula:
    > =LEFT(A1,FIND("-",A1,1)-1)
    >
    > but it assumes some level of consistency, the "-" in the cell.
    >
    > Any ideas appreciated.
    >
    > Robert


  3. #3
    az94
    Guest

    Re: String parsing with variable lenght strings

    It's a bit ugly, but it seems to work. Assuming your values start in
    A1 and go down, the following formula in cell B1 will break out the
    number.

    =IF(ISERR(VALUE(LEFT(A1,2))),VALUE(LEFT(A1,1)),IF(ISERR(VALUE(LEFT(A1,3))),VALUE(LEFT(A1,2)),IF(ISERR(VALUE(LEFT(A1,4))),VALUE(LEFT(A1,3)),IF(ISERR(VALUE(LEFT(A1,5))),VALUE(LEFT(A1,4)),"Check
    num length"))))

    This will test the first two characters, then the first 3, then the
    first 4 etc. to see if it is a number, and when it hits a length that
    includes an alpha, it will drop down a character and return the value.
    I stopped at 4 digits, but obviously you can extend it as much as you
    need.

    Then you can use the following formula in cell C1 to give you the text.

    =RIGHT(A1,LEN(A1)-LEN(B1))

    Note: The second formula will include a space if there is one. You
    may want to remove all spaces before you begin to avoid that.

    Hope that helps...


  4. #4
    Ron Rosenfeld
    Guest

    Re: String parsing with variable lenght strings

    On Tue, 28 Mar 2006 12:04:02 -0800, frosterrj
    <[email protected]> wrote:

    >I am trying to split up a cell into numbers and charachters and place them in
    >separate columns, but the lenght of the number part varies as does the
    >content of the character part.
    >
    >For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    >I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    >own column.
    >
    >I have found this formula:
    >=LEFT(A1,FIND("-",A1,1)-1)
    >
    >but it assumes some level of consistency, the "-" in the cell.
    >
    >Any ideas appreciated.
    >
    >Robert


    Ron C.'s formulas should work.

    For a different approach, and to gain access to a number of other useful
    functions, you could download and install Longre's free morefunc.xll add-in
    from http://xcell05.free.fr/

    Then use the regular expression formulas:

    For the number:
    =REGEX.MID(A1,"[0-9]+")

    (means return the first series of characters that are in the class [0-9].

    For the units:
    =REGEX.MID(A1,"[^0-9 ]+")

    (means return the first series of characters that are not in the class [0-9 ]
    -- note that there is a <space> included in the expression, so that any space
    in between the number and the units will not be returned).


    --ron

  5. #5
    Harlan Grove
    Guest

    Re: String parsing with variable lenght strings

    frosterrj wrote...
    >I am trying to split up a cell into numbers and charachters and place them in
    >separate columns, but the lenght of the number part varies as does the
    >content of the character part.
    >
    >For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    >I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    >own column.

    ....

    Does each such record contain only 4 fields? And does every field
    contain both initial numbers followed by text? And are all fields
    separated by commas and possibly spaces?

    If so, then define a name like seq referring to

    =ROW(INDEX(Sheet1!$A:$A,1):INDEX(Sheet1!$A:$A,256))

    Then if your record were in cell A1, either copy it into cell A2 then
    select A2:A3 and run Edit > Replace to replace all spaces with nothing,
    thus deleting all spaces, or enter the following formula in A2.

    A2:
    =SUBSTITUTE(A1," ","")

    Then try these formulas.

    A3:
    =--LEFT(TRIM(A2),LOOKUP(2,1/ISNUMBER(-MID(A2,1,seq)),seq))

    B3:
    =MID($A2,SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3),FIND(",",$A2&",",
    SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3))-SUMPRODUCT(LEN($A3:A3))
    -COUNT($A3:A3))

    C3:
    =--MID($A2,SUMPRODUCT(LEN($A3:B3))+COUNT($A3:B3,1),
    LOOKUP(2,1/ISNUMBER(-MID($A2,SUMPRODUCT(LEN($A3:B3))
    +COUNT($A3:B3,1),seq)),seq))

    Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
    etc.

    Alternatively, download and install Laurent Longre's MOREFUNC.XLL
    add-in, freely available from http://xcell05.free.fr/english/ , and use
    its REGEX.MID and REGEX.SUBSTITUTE functions like so.

    A3:
    =--REGEX.MID($A1,"\d*\.?\d+",(COLUMNS($A3:A3)+1)/2)

    B3:
    =REGEX.SUBSTITUTE($A1,"([^,]+, *){"&(COLUMNS($A4:B4)/2-1)&
    "}\d*\.?\d+ *([^,]+).*","[2]")

    Copy A3:B3 and paste into C3:D3, E3:F3, G3:H3, etc.


  6. #6
    Harlan Grove
    Guest

    Re: String parsing with variable lenght strings

    Ron Rosenfeld wrote...
    ....
    >Then use the regular expression formulas:
    >
    >For the number:
    >=REGEX.MID(A1,"[0-9]+")


    It works, but \d+ is more compact.

    >(means return the first series of characters that are in the class [0-9].
    >
    >For the units:
    >=REGEX.MID(A1,"[^0-9 ]+")


    It picks up the commas as well. Those could be excluded with a positive
    lookahead assertion and a comma appended to the string.

    B3:
    =REGEX.MID($A1&",","[^0-9 ]+(?=,)",COLUMNS($A3:B3)/2)


  7. #7
    frosterrj
    Guest

    Re: String parsing with variable lenght strings

    Actually, I am having the most trouble because the numbers can contain
    decimals:
    7.63 lb, 66.5 oz etc.
    Seems to be stopping at the '.' .

    "Harlan Grove" wrote:

    > frosterrj wrote...
    > >I am trying to split up a cell into numbers and charachters and place them in
    > >separate columns, but the lenght of the number part varies as does the
    > >content of the character part.
    > >
    > >For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    > >I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    > >own column.

    > ....
    >
    > Does each such record contain only 4 fields? And does every field
    > contain both initial numbers followed by text? And are all fields
    > separated by commas and possibly spaces?
    >
    > If so, then define a name like seq referring to
    >
    > =ROW(INDEX(Sheet1!$A:$A,1):INDEX(Sheet1!$A:$A,256))
    >
    > Then if your record were in cell A1, either copy it into cell A2 then
    > select A2:A3 and run Edit > Replace to replace all spaces with nothing,
    > thus deleting all spaces, or enter the following formula in A2.
    >
    > A2:
    > =SUBSTITUTE(A1," ","")
    >
    > Then try these formulas.
    >
    > A3:
    > =--LEFT(TRIM(A2),LOOKUP(2,1/ISNUMBER(-MID(A2,1,seq)),seq))
    >
    > B3:
    > =MID($A2,SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3),FIND(",",$A2&",",
    > SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3))-SUMPRODUCT(LEN($A3:A3))
    > -COUNT($A3:A3))
    >
    > C3:
    > =--MID($A2,SUMPRODUCT(LEN($A3:B3))+COUNT($A3:B3,1),
    > LOOKUP(2,1/ISNUMBER(-MID($A2,SUMPRODUCT(LEN($A3:B3))
    > +COUNT($A3:B3,1),seq)),seq))
    >
    > Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
    > etc.
    >
    > Alternatively, download and install Laurent Longre's MOREFUNC.XLL
    > add-in, freely available from http://xcell05.free.fr/english/ , and use
    > its REGEX.MID and REGEX.SUBSTITUTE functions like so.
    >
    > A3:
    > =--REGEX.MID($A1,"\d*\.?\d+",(COLUMNS($A3:A3)+1)/2)
    >
    > B3:
    > =REGEX.SUBSTITUTE($A1,"([^,]+, *){"&(COLUMNS($A4:B4)/2-1)&
    > "}\d*\.?\d+ *([^,]+).*","[2]")
    >
    > Copy A3:B3 and paste into C3:D3, E3:F3, G3:H3, etc.
    >
    >


  8. #8
    Harlan Grove
    Guest

    Re: String parsing with variable lenght strings

    frosterrj wrote...
    >Actually, I am having the most trouble because the numbers can contain
    >decimals:
    >7.63 lb, 66.5 oz etc.
    >Seems to be stopping at the '.' .


    What is? Rather, which formulas are? The one I proposed below allow for
    decimal fractions.

    >"Harlan Grove" wrote:
    >> frosterrj wrote...

    ....
    >>>For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    >>>I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    >>>own column.

    .....

    And with the following modified string in cell A1

    5.5#, 10Tins, 4 lb, 100.23Pcs

    and the following defined name

    >>If so, then define a name like seq referring to
    >>
    >>=ROW(INDEX(Sheet1!$A:$A,1):INDEX(Sheet1!$A:$A,256))

    ....

    And the following intermediate formula

    >>A2:
    >>=SUBSTITUTE(A1," ","")


    The following formulas

    >>A3:
    >>=--LEFT(TRIM(A2),LOOKUP(2,1/ISNUMBER(-MID(A2,1,seq)),seq))
    >>
    >>B3:
    >>=MID($A2,SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3),FIND(",",$A2&",",
    >>SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3))-SUMPRODUCT(LEN($A3:A3))
    >>-COUNT($A3:A3))
    >>
    >>C3:
    >>=--MID($A2,SUMPRODUCT(LEN($A3:B3))+COUNT($A3:B3,1),
    >>LOOKUP(2,1/ISNUMBER(-MID($A2,SUMPRODUCT(LEN($A3:B3))
    >>+COUNT($A3:B3,1),seq)),seq))
    >>
    >>Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
    >>etc.

    ....

    produce the following results (A3:H3).

    5.5 # 10 Tins 4 lb 100.23 Pcs

    And the following formulas

    >>A3:
    >>=--REGEX.MID($A1,"\d*\.?\d+",(COLUMNS($A3:A3)+1)/2)
    >>
    >>B3:
    >>=REGEX.SUBSTITUTE($A1,"([^,]+, *){"&(COLUMNS($A4:B4)/2-1)&
    >>"}\d*\.?\d+ *([^,]+).*","[2]")
    >>
    >>Copy A3:B3 and paste into C3:D3, E3:F3, G3:H3, etc.


    produce the same,

    5.5 # 10 Tins 4 lb 100.23 Pcs

    Doesn't appear to stop at decimal fractions. Or are you referring to
    some of the other responses that took your one sample record as
    exhaustively indicative of your data?


  9. #9
    Ron Rosenfeld
    Guest

    Re: String parsing with variable lenght strings

    On 28 Mar 2006 15:04:06 -0800, "Harlan Grove" <[email protected]> wrote:

    >Ron Rosenfeld wrote...
    >...
    >>Then use the regular expression formulas:
    >>
    >>For the number:
    >>=REGEX.MID(A1,"[0-9]+")

    >
    >It works, but \d+ is more compact.


    I phrased it this way for clarity, especially when comparing with the second
    formula.

    >
    >>(means return the first series of characters that are in the class [0-9].
    >>
    >>For the units:
    >>=REGEX.MID(A1,"[^0-9 ]+")

    >
    >It picks up the commas as well. Those could be excluded with a positive
    >lookahead assertion and a comma appended to the string.


    For some reason, I thought each combination of value/units was in a separate
    cell. If so, commas would not be an issue.
    --ron

  10. #10
    Ron Rosenfeld
    Guest

    Re: String parsing with variable lenght strings

    On Tue, 28 Mar 2006 17:47:10 -0500, Ron Rosenfeld <[email protected]>
    wrote:

    >On Tue, 28 Mar 2006 12:04:02 -0800, frosterrj
    ><[email protected]> wrote:
    >
    >>I am trying to split up a cell into numbers and charachters and place them in
    >>separate columns, but the lenght of the number part varies as does the
    >>content of the character part.
    >>
    >>For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    >>I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    >>own column.
    >>
    >>I have found this formula:
    >>=LEFT(A1,FIND("-",A1,1)-1)
    >>
    >>but it assumes some level of consistency, the "-" in the cell.
    >>
    >>Any ideas appreciated.
    >>
    >>Robert

    >
    >Ron C.'s formulas should work.
    >
    >For a different approach, and to gain access to a number of other useful
    >functions, you could download and install Longre's free morefunc.xll add-in
    >from http://xcell05.free.fr/
    >
    >Then use the regular expression formulas:
    >
    >For the number:
    >=REGEX.MID(A1,"[0-9]+")
    >
    >(means return the first series of characters that are in the class [0-9].
    >
    >For the units:
    >=REGEX.MID(A1,"[^0-9 ]+")
    >
    >(means return the first series of characters that are not in the class [0-9 ]
    >-- note that there is a <space> included in the expression, so that any space
    >in between the number and the units will not be returned).
    >
    >
    >--ron


    Having read that some of your data may optionally include decimal numbers, I
    would change my previous recommendation to:

    Number:

    =REGEX.MID(A1,"[0-9]+(\.[0-9]+)?")

    This assumes that all decimal digits have a number before the decimal point.
    If that is not the case, it can be easily modified.

    Units:

    =REGEX.MID(A1,"[^0-9. ]+")

    This assumes there are not "dots" in the units, as in your examples. Again, if
    there are, the expression can be modified to include them.


    --ron

  11. #11
    Harlan Grove
    Guest

    Re: String parsing with variable lenght strings

    Ron Rosenfeld wrote...
    ....
    >Having read that some of your data may optionally include decimal numbers, I
    >would change my previous recommendation to:
    >
    >Number:
    >
    >=REGEX.MID(A1,"[0-9]+(\.[0-9]+)?")
    >
    >This assumes that all decimal digits have a number before the decimal point.
    >If that is not the case, it can be easily modified.
    >
    >Units:
    >
    >=REGEX.MID(A1,"[^0-9. ]+")
    >
    >This assumes there are not "dots" in the units, as in your examples. Again, if
    >there are, the expression can be modified to include them.


    It gets more difficult if the numbers could include 0. and .0 and if
    the units could include periods and numerals, e.g., 20 reams A4, 5
    boxes .22 shells.

    With just built-in formulas referencing the original record in A1 plus
    the name seq as I defined it earlier,

    A3:
    =LEFT(A1,LOOKUP(2,1/ISNUMBER(-MID(A1,1,seq)),seq))

    B3:
    =MID($A1,SUMPRODUCT(LEN($A3:A3))+COLUMNS($A3:B3)/2,
    FIND(",",$A1&",",SUMPRODUCT(LEN($A3:A3))+COLUMNS($A3:B3)/2)
    -SUMPRODUCT(LEN($A3:A3))-COLUMNS($A3:B3)/2)

    C3:
    =MID($A1,SUMPRODUCT(LEN($A3:B3))+1+COLUMNS($A3:B3)/2,
    LOOKUP(2,1/ISNUMBER(-MID($A1,SUMPRODUCT(LEN($A3:B3))+1
    +COLUMNS($A3:B3)/2,seq)),seq))

    Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
    etc. Note that the number portions are text in the formulas above.

    With MOREFUNC.XLL, divide and conquer.

    A3:
    =--REGEX.MID(REGEX.MID($A1,"[^,]+",(COLUMNS($A3:A3)+1)/2),"\d+(\.\d*)?|\.\d+")

    B3:
    =REGEX.SUBSTITUTE(REGEX.MID($A1,"[^,]+",COLUMNS($A3:B3)/2),
    " *"&REPLACE(A3,2,0,IF(LEFT(A3,1)="0","?",""))&" *","")

    Copy A3:B3 and paste into C3:D3, E3:F3, etc.

    Note: I'm assuming the OP's records are in one cell given the OP's
    statement: "one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc." I'm
    interpretting the 'etc.' to mean there could be more fields in the
    record, not that each of these be in a separate column.


  12. #12
    Ron Rosenfeld
    Guest

    Re: String parsing with variable lenght strings

    On 28 Mar 2006 22:40:41 -0800, "Harlan Grove" <[email protected]> wrote:

    >Ron Rosenfeld wrote...
    >...
    >>Having read that some of your data may optionally include decimal numbers, I
    >>would change my previous recommendation to:
    >>
    >>Number:
    >>
    >>=REGEX.MID(A1,"[0-9]+(\.[0-9]+)?")
    >>
    >>This assumes that all decimal digits have a number before the decimal point.
    >>If that is not the case, it can be easily modified.
    >>
    >>Units:
    >>
    >>=REGEX.MID(A1,"[^0-9. ]+")
    >>
    >>This assumes there are not "dots" in the units, as in your examples. Again, if
    >>there are, the expression can be modified to include them.

    >
    >It gets more difficult if the numbers could include 0. and .0 and if
    >the units could include periods and numerals, e.g., 20 reams A4, 5
    >boxes .22 shells.
    >
    >With just built-in formulas referencing the original record in A1 plus
    >the name seq as I defined it earlier,
    >
    >A3:
    >=LEFT(A1,LOOKUP(2,1/ISNUMBER(-MID(A1,1,seq)),seq))
    >
    >B3:
    >=MID($A1,SUMPRODUCT(LEN($A3:A3))+COLUMNS($A3:B3)/2,
    >FIND(",",$A1&",",SUMPRODUCT(LEN($A3:A3))+COLUMNS($A3:B3)/2)
    >-SUMPRODUCT(LEN($A3:A3))-COLUMNS($A3:B3)/2)
    >
    >C3:
    >=MID($A1,SUMPRODUCT(LEN($A3:B3))+1+COLUMNS($A3:B3)/2,
    >LOOKUP(2,1/ISNUMBER(-MID($A1,SUMPRODUCT(LEN($A3:B3))+1
    >+COLUMNS($A3:B3)/2,seq)),seq))
    >
    >Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
    >etc. Note that the number portions are text in the formulas above.
    >
    >With MOREFUNC.XLL, divide and conquer.
    >
    >A3:
    >=--REGEX.MID(REGEX.MID($A1,"[^,]+",(COLUMNS($A3:A3)+1)/2),"\d+(\.\d*)?|\.\d+")
    >
    >B3:
    >=REGEX.SUBSTITUTE(REGEX.MID($A1,"[^,]+",COLUMNS($A3:B3)/2),
    >" *"&REPLACE(A3,2,0,IF(LEFT(A3,1)="0","?",""))&" *","")
    >
    >Copy A3:B3 and paste into C3:D3, E3:F3, etc.
    >
    >Note: I'm assuming the OP's records are in one cell given the OP's
    >statement: "one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc." I'm
    >interpretting the 'etc.' to mean there could be more fields in the
    >record, not that each of these be in a separate column.


    Yes, the "best" solution often depends critically on the nature of the data
    being analyzed. You've got a lot more experience than I on generalizing from
    incomplete data specifications. That's very useful in these NG's.


    --ron

  13. #13
    frosterrj
    Guest

    Re: String parsing with variable lenght strings

    I was using this suggestion from above:
    =IF(ISERR(VALUE(LEFT(A1,2))),VALUE(LEFT(A1,1)),IF(ISERR(VALUE(LEFT(A1,3))),VALUE(LEFT(A1,2)),IF(ISERR(VALUE(LEFT(A1,4))),VALUE(LEFT(A1,3)),IF(ISERR(VALUE(LEFT(A1,5))),VALUE(LEFT(A1,4)),"Check
    num length"))))

    I think I misunderstood what you were telling me below (too late in the day
    for really complex formulas!).

    Robert

    "Harlan Grove" wrote:

    > frosterrj wrote...
    > >Actually, I am having the most trouble because the numbers can contain
    > >decimals:
    > >7.63 lb, 66.5 oz etc.
    > >Seems to be stopping at the '.' .

    >
    > What is? Rather, which formulas are? The one I proposed below allow for
    > decimal fractions.
    >
    > >"Harlan Grove" wrote:
    > >> frosterrj wrote...

    > ....
    > >>>For example, one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc.
    > >>>I would like to be able to pull out the #, Tins, lb, and Pcs in to their
    > >>>own column.

    > .....
    >
    > And with the following modified string in cell A1
    >
    > 5.5#, 10Tins, 4 lb, 100.23Pcs
    >
    > and the following defined name
    >
    > >>If so, then define a name like seq referring to
    > >>
    > >>=ROW(INDEX(Sheet1!$A:$A,1):INDEX(Sheet1!$A:$A,256))

    > ....
    >
    > And the following intermediate formula
    >
    > >>A2:
    > >>=SUBSTITUTE(A1," ","")

    >
    > The following formulas
    >
    > >>A3:
    > >>=--LEFT(TRIM(A2),LOOKUP(2,1/ISNUMBER(-MID(A2,1,seq)),seq))
    > >>
    > >>B3:
    > >>=MID($A2,SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3),FIND(",",$A2&",",
    > >>SUMPRODUCT(LEN($A3:A3))+COUNT($A3:A3))-SUMPRODUCT(LEN($A3:A3))
    > >>-COUNT($A3:A3))
    > >>
    > >>C3:
    > >>=--MID($A2,SUMPRODUCT(LEN($A3:B3))+COUNT($A3:B3,1),
    > >>LOOKUP(2,1/ISNUMBER(-MID($A2,SUMPRODUCT(LEN($A3:B3))
    > >>+COUNT($A3:B3,1),seq)),seq))
    > >>
    > >>Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
    > >>etc.

    > ....
    >
    > produce the following results (A3:H3).
    >
    > 5.5 # 10 Tins 4 lb 100.23 Pcs
    >
    > And the following formulas
    >
    > >>A3:
    > >>=--REGEX.MID($A1,"\d*\.?\d+",(COLUMNS($A3:A3)+1)/2)
    > >>
    > >>B3:
    > >>=REGEX.SUBSTITUTE($A1,"([^,]+, *){"&(COLUMNS($A4:B4)/2-1)&
    > >>"}\d*\.?\d+ *([^,]+).*","[2]")
    > >>
    > >>Copy A3:B3 and paste into C3:D3, E3:F3, G3:H3, etc.

    >
    > produce the same,
    >
    > 5.5 # 10 Tins 4 lb 100.23 Pcs
    >
    > Doesn't appear to stop at decimal fractions. Or are you referring to
    > some of the other responses that took your one sample record as
    > exhaustively indicative of your data?
    >
    >


  14. #14
    frosterrj
    Guest

    Re: String parsing with variable lenght strings

    Nope, my data is just one type per cell:
    9.5 OZ
    30#
    #10 TIN
    1 GAL
    38#
    the basic regex.mid works.

    Thanks for the pointers to the morefunc.xls. Got it an am using now.

    Robert

    "Harlan Grove" wrote:

    > Ron Rosenfeld wrote...
    > ....
    > >Having read that some of your data may optionally include decimal numbers, I
    > >would change my previous recommendation to:
    > >
    > >Number:
    > >
    > >=REGEX.MID(A1,"[0-9]+(\.[0-9]+)?")
    > >
    > >This assumes that all decimal digits have a number before the decimal point.
    > >If that is not the case, it can be easily modified.
    > >
    > >Units:
    > >
    > >=REGEX.MID(A1,"[^0-9. ]+")
    > >
    > >This assumes there are not "dots" in the units, as in your examples. Again, if
    > >there are, the expression can be modified to include them.

    >
    > It gets more difficult if the numbers could include 0. and .0 and if
    > the units could include periods and numerals, e.g., 20 reams A4, 5
    > boxes .22 shells.
    >
    > With just built-in formulas referencing the original record in A1 plus
    > the name seq as I defined it earlier,
    >
    > A3:
    > =LEFT(A1,LOOKUP(2,1/ISNUMBER(-MID(A1,1,seq)),seq))
    >
    > B3:
    > =MID($A1,SUMPRODUCT(LEN($A3:A3))+COLUMNS($A3:B3)/2,
    > FIND(",",$A1&",",SUMPRODUCT(LEN($A3:A3))+COLUMNS($A3:B3)/2)
    > -SUMPRODUCT(LEN($A3:A3))-COLUMNS($A3:B3)/2)
    >
    > C3:
    > =MID($A1,SUMPRODUCT(LEN($A3:B3))+1+COLUMNS($A3:B3)/2,
    > LOOKUP(2,1/ISNUMBER(-MID($A1,SUMPRODUCT(LEN($A3:B3))+1
    > +COLUMNS($A3:B3)/2,seq)),seq))
    >
    > Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
    > etc. Note that the number portions are text in the formulas above.
    >
    > With MOREFUNC.XLL, divide and conquer.
    >
    > A3:
    > =--REGEX.MID(REGEX.MID($A1,"[^,]+",(COLUMNS($A3:A3)+1)/2),"\d+(\.\d*)?|\.\d+")
    >
    > B3:
    > =REGEX.SUBSTITUTE(REGEX.MID($A1,"[^,]+",COLUMNS($A3:B3)/2),
    > " *"&REPLACE(A3,2,0,IF(LEFT(A3,1)="0","?",""))&" *","")
    >
    > Copy A3:B3 and paste into C3:D3, E3:F3, etc.
    >
    > Note: I'm assuming the OP's records are in one cell given the OP's
    > statement: "one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc." I'm
    > interpretting the 'etc.' to mean there could be more fields in the
    > record, not that each of these be in a separate column.
    >
    >


  15. #15
    Ron Rosenfeld
    Guest

    Re: String parsing with variable lenght strings

    On Fri, 31 Mar 2006 13:01:03 -0800, frosterrj
    <[email protected]> wrote:

    >Nope, my data is just one type per cell:
    >9.5 OZ
    >30#
    >#10 TIN
    >1 GAL
    >38#
    >the basic regex.mid works.
    >
    >Thanks for the pointers to the morefunc.xls. Got it an am using now.
    >
    >Robert


    Glad you've got it working.

    By the way, if you need to distribute your workbook, and if you installed
    morefunc using the suggested defaults, there is an option on the Tools Menu to
    save morefunc with the workbook.


    --ron

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1